Street-fighting mathematics

Don’t you just love that title for a book? It’s by Sanjoy Mahajan, and it reminds me of a statistics course I once heard of called “Extreme experiment design” or something similar.
The first chapter is probably the hardest, but also the one of most relevance to statisticians. It’s on dimensions, and includes the actual reference explaining why the Mars Orbiter crashed. The other chapters contain some useful tools (a tool = a trick I use twice) for problem solving, namely easy cases, lumping, pictorial proof, taking out the big part, and analogy. It reminds me of the little book on proof by Plumpton, Shipton and Perry that was one of my first-year maths textbooks. Its advice included ideas like “can you solve a special case? can you solve it for large n or small n?” … until the final question “can you do anything at all?”!
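
To give the flavour of “taking out the big part” (my example, not necessarily one of Mahajan’s): to estimate 3.15 × 7.21, take out the big part, 3 × 7 = 21, and treat the rest as small corrections, so 3.15 × 7.21 = 21 × 1.05 × 1.03 ≈ 21 × 1.08 ≈ 22.7, pleasingly close to the exact 22.7115.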


High-throughput detection of biosecurity agents using high resolution melt analysis, next-generation sequencing and protein profiling

Actually, I think this title is even longer than the previous one! Sorelle Bowman also gave the confirmation seminar for her PhD in Forensic Science on 17 September. The biosecurity agents of her project include such nasties as anthrax and plague, ranging from pure strains grown in controlled laboratories to (hopefully agent-free!) samples collected from airport carpets. I’ll be supervising the statistical aspects of Sorelle’s work, which is likely to include traditional statistical discrimination methods such as logistic regression and linear discriminant analysis, principal components analysis to deal with high-dimensional data, and machine learning methods like decision trees and SVMs. Some of the questions after the talk highlighted issues we’ll have to address, such as sensitivity and underlying population variance. We’ll also have to clarify interactions between method and agent (will SVMs work better for anthrax and decision trees for plague, for instance?) and combinations of methods (will SVMs work better preceded by a PCA or a decision tree, for instance?). Brett Lidbury and I addressed such combinations of methods in the context of decision trees for laboratory prediction of hepatitis virus in our 2013 BMC Bioinformatics paper.
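
To make the “combinations of methods” question concrete, here’s a minimal sketch in Python with scikit-learn, on simulated data (my illustration only, not Sorelle’s actual pipeline): PCA to tame the high-dimensional profiles, followed by an SVM classifier, assessed by cross-validation.

```python
# Minimal sketch, assuming scikit-learn; the data are simulated
# stand-ins for high-dimensional agent profiles (melt curves, spectra).
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 500))    # 60 samples, 500 features
y = rng.integers(0, 2, size=60)   # two hypothetical agent classes

# PCA reduces the 500 features to 10 components before the SVM sees them.
pipe = make_pipeline(PCA(n_components=10), SVC(kernel="rbf"))
print(cross_val_score(pipe, X, y, cv=5).mean())
```

Swapping the SVC for a decision tree classifier is a one-line change to the pipeline, which is exactly what makes these head-to-head comparisons cheap to run.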


The effect of gamma radiation-induced oxidative cell stress on nuclear and mitochondrial forensic genotyping

That’s a strong entrant for the talk-with-the-longest-title competition for sure! Corey Goodwin finished his Honours in Forensic Science last year and has embarked on this PhD topic. He gave his confirmation seminar on 17 September. He’ll be comparing nDNA and mtDNA damage and forensic profiling capabilities following low and high doses of gamma radiation, using a variety of white-box methods (as in big laboratory machines), as opposed to so-called black-box algorithmic statistical methods. He seemed to have a good handle on his aims and methods, and I wish him well in his studies.


Past, Present and Future of Statistical Science I

Wow, this handy volume from COPSS has 52 chapters! But they’re generally (a little) shorter and lighter than “Statistics in Action”, reviewed in previous posts. This book I’ll also review in parts, as that’s how it is presented. Part I: The history of COPSS contains just one paper, by Ingram Olkin. Entitled “A brief history of COPSS”, it is a succinct history of 50 years, with complete tables of prizewinners and the like.


Removing unwanted variation from high-throughput omic data

Terry Speed is touring the nation with four talks, making up the AMSI-SSAI Lecture Tour. On 26 August he spoke on this topic at ABS House to a big crowd including ABS employees, Statistical Society members and Uni of Canberra students. His take-home message, that simple statistical methods can provide useful solutions to big modern problems, was a really valid one, I think. It helps to validate what I’m doing with those Uni students, teaching them a bunch of very classical multivariate statistics methods. Long live principal components analysis!
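
The title topic deserves a gloss. I won’t try to reproduce Terry’s actual methods here, but a crude sketch of the general idea, in Python on simulated data and with my own simplifications, goes like this: estimate the unwanted factors from negative-control features, then regress them out of everything.

```python
import numpy as np

# Crude sketch only, not Terry Speed's actual RUV method: estimate
# unwanted factors from (hypothetical) negative-control features via
# SVD, then regress them out of the whole data matrix.
rng = np.random.default_rng(1)
Y = rng.normal(size=(100, 1000))   # samples x features (e.g. genes)
controls = np.arange(50)           # indices of assumed negative controls
k = 2                              # number of unwanted factors (assumed)

U, s, _ = np.linalg.svd(Y[:, controls], full_matrices=False)
W = U[:, :k] * s[:k]               # estimated unwanted factors

beta, *_ = np.linalg.lstsq(W, Y, rcond=None)
Y_clean = Y - W @ beta             # residuals after removing the factors
```

Simple linear algebra, and very much in the spirit of the take-home message.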


Goulburn 25

It’s amazing how the four talks at each Goulburn meeting (the latest on 20 August) often have connecting threads despite their disparate natures. The first two talks were Alan Welsh’s on compositional data and Walt Davis’s on factor analysis. The connecting thread there was an attempt to clarify entrenched positions in the literature and bring some sense to data analysis. Michael Stewart’s talk on two-component mixtures was also reminiscent of Alan’s modus operandi of finding a very simple-looking problem and pursuing it carefully to its logical conclusion, often with surprisingly deep results. In the traditionally difficult time slot straight after lunch, Bronwyn Loong’s talk on confidentiality was just as mathematically rigorous but had the broadest relevance to providers of data and consumers of data analysis alike.

Hopefully the renovations at Trappers Motel and Conference Centre will be finished soon. The new toilets in the restaurant are certainly very elegant.


Statistics in action part three

Twenty-one chapters about the Canadian contribution to statistical theory and practice. I thought I’d try writing pi-ems about each chapter, where the number of words in each line of the poem is 3, 1, 4, 1, 5, 9. To avoid the posts getting too long, I’ll do them in batches of seven. Here’s the third and final instalment!
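
Since the form is just a word-count constraint, here is a toy Python checker, purely my own aside. Counting hyphenated compounds like “Black-Scholes” as two words makes the Chapter 15 poem below scan, so the checker splits on hyphens as well as whitespace:

```python
import re

# Toy validator for the pi-em form: 3, 1, 4, 1, 5, 9 words per line.
# Hyphenated compounds count as two words; other counting conventions
# would need a different splitting rule.
def is_piem(poem: str) -> bool:
    counts = [len([w for w in re.split(r"[\s-]+", line) if w])
              for line in poem.strip().splitlines()]
    return counts == [3, 1, 4, 1, 5, 9]

print(is_piem("""Clear Canadian contributions
Accessible
From Black-Scholes on.
Timely
Financial engineers contribute much and
A statistical formula on its own is not dangerous."""))  # True
```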

Chapter 15. Statistics in financial engineering.
Clear Canadian contributions
Accessible
From Black-Scholes on.
Timely
Financial engineers contribute much and
A statistical formula on its own is not dangerous.

Chapter 16. Making personalised recommendations in e-commerce
Loved the writing:
Timely.
Loved the models too:
Accessible.
Not a specially Canadian issue
But an interesting multivariate problem with websites to boot.

Chapter 17. What do salmon and injection drug users have in common?
Elusiveness, that’s what!
Solution?
Capture-recapture, Lincoln-Petersen
Estimator
And lots of Canadian research.
Unexpected drops in sockeye populations driving much research effort.
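
(An aside for non-statistician readers, since the poems are terse: the Lincoln-Petersen estimator is delightfully simple. Mark and release n1 animals, then draw a second sample of n2 animals, of which m2 turn out to be marked; the estimated population size is N̂ = n1 × n2 / m2, on the reasoning that the marked fraction of the second sample should match the marked fraction of the whole population.)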

Chapter 18. Capture-recapture methods for estimating the size of a population: dealing with variable capture probabilities
Animals including mice.
People
This time illegal immigrants.
Estimate
With Binomial and Poisson models
And a variety of tools to cope with heterogeneity.

Chapter 19. Challenges in statistical marine ecology
Counting marine populations:
Tricky.
Hammerhead sharks, Atlantic cod:
Focus.
Zero-inflated? No, hurdle.
Bycatch data yields knowledge thanks to friends of old.

Chapter 20. Quantifying the human and natural contributions to observed climate change.
Climate change is
real,
graphs and models reveal.
Paradoxically
big data and small sample
come together with distributed modelling and strong assumptions.

Chapter 21. Data hungry models in a food hungry world: an interdisciplinary challenge bridged by statistics.
Satellite remote sensing
Vegetation
For predicting crop yield.
Six
Indices I’d never heard of.
Nice analytical framework including linear models, forecasting and hindcasting!


Using online media strategically

Deborah Lupton from the News and Media Research Centre at UC gave this talk on 11 August, aimed at early career researchers, but for me it was a useful pretext to share lunch with Deborah and chat about blogs and big data: the talk turned out to be great common ground! She talked through Academia.edu, ResearchGate, LinkedIn, Google Scholar, Wikipedia, Twitter, Pinterest, Storify, Facebook, Slideshare, podcasts and YouTube videos, and blogs (WordPress and Tumblr; group, individual or guest).


On the misuse, neglect, and nonsense use of epidemiology and effect measures in epidemiological research

Sander Greenland, statistician and epidemiologist from the University of California Los Angeles, gave this very provocatively titled talk on 6 August at NCEPH. Over 50 people attended, filling the little lecture theatre in the NCEPH buildings.

The talk was based on a paper published in 2005 in Emerging Themes in Epidemiology 2(4) (www.ete-online.com). His extreme example to get us all laughing at the start was to point out that the burden of death associated with birth is 100%, and that death could therefore be eliminated by preventing all births! He then proceeded to a more realistic, and much more controversial, scenario: decreasing the burden of disease associated with tobacco use. Sander pointed out that smoking cessation is not a real intervention, only smoking cessation programs are, and he also spent quite a bit of time discussing the merits of tobacco replacement products such as the Swedish snus.

I liked his reference to the modularity of statistics (like computer programming), in other words its ability to successfully tackle problems by breaking them up into smaller, more digestible pieces. I was also intrigued by his reference to an interesting exchange of letters in Biometrics in 1988 on the topic of independent competing risks.

His final exhortations to us all revolved around not accepting any policy claim unless you’ve verified it yourself, and taking up the challenge to publicly criticise flawed scientific evidence whenever the opportunity arises. Thus fired up, we all headed out into the crisp Canberra winter morning air.


The Longitude Prize: a new (old) way to stimulate innovation

This post from the Cooperative Research Centres Association newsletter fired up my imagination. I hope it fires up yours!

“The British public has voted for antibiotic resistance research to be the subject of the “first” Longitude Prize. The Longitude Prize 2014 is a prize fund of £10 million to tackle one of the biggest issues facing humanity, with the British public voting for antibiotic resistance over the field that also included flight, food, paralysis, water and dementia. The Longitude Prize 2014 commemorates the 300th anniversary of the Longitude Act of 1714, which was eventually awarded in 1765 to John Harrison for his chronometer (as well as sparking many other innovations).

The announcement of the public vote was made live on the BBC by British Prime Minister, David Cameron last month. The award is administered by the innovation charity Nesta, with the prize fund being put up by the UK’s Technology Strategy Board. Lord Rees, the English Astronomer Royal, chairs the Longitude Committee, which is still working out final rules for awarding the prize.

What a magnificent way to capture the public’s imagination and highlight the importance of innovation.”
