Tag Archives: Datasets

Recommended: Making it easier to discover data sets

I heard about this from the UMKC Bioinformatics Twitter feed. Google has a blog entry highlighting a new search feature they’ve developed, Dataset Search. It lets you find interesting data sets using standard Google search criteria. The system only works if people on the web provide reasonable documentation of their data sets. I’ve not had a chance to work with this yet, but it looks interesting.
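As far as I understand it, the "reasonable documentation" that Dataset Search looks for is structured metadata that follows the schema.org Dataset vocabulary, embedded in the web page that describes the data. Here is a minimal sketch of what that might look like, built in Python. The helper name `dataset_jsonld`, the example data set, and the example.org URL are all made up for illustration, and a real page would probably want more fields than these.

```python
import json


def dataset_jsonld(name, description, url, keywords=()):
    """Build a schema.org Dataset description as a JSON-LD string.

    This is a hypothetical helper, not part of any Google tool.
    """
    record = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
    }
    # Keywords are optional; only include them when supplied.
    if keywords:
        record["keywords"] = list(keywords)
    return json.dumps(record, indent=2)


# Every value below is invented for the example.
example = dataset_jsonld(
    name="Example clinic wait times",
    description="Made-up daily wait times at a fictional clinic.",
    url="https://example.org/data/wait-times",
    keywords=["wait times", "example"],
)
print(example)
```

The resulting text would go inside a `<script type="application/ld+json">` element on the page that describes the data set.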

Recommended: Adherence to Methodological Standards in Research Using the National Inpatient Sample

I normally don’t recommend articles that are stuck behind paywalls, but this is an important article. It shows how 85% of a sample of research studies using the National Inpatient Sample database failed to follow at least one of seven well-documented practice recommendations from the Agency for Healthcare Research and Quality. Continue reading

Recommended: Medicare Claims Synthetic Public Use Files (SynPUFs)

The Centers for Medicare & Medicaid Services (CMS) provides researchers with access to Medicare claims data, which is a wonderful resource. But you have to sign a restrictive agreement before they will give you this data, and you have to pay a non-trivial amount of money to get it. Fair enough, because CMS has to guarantee patient confidentiality, among other things. But what if you want to “play” with the data before taking the plunge? Thankfully, CMS has provided to the general public a synthetic (read: fake) data set that has the same data structure. This allows you to prototype your programs on the synthetic data and then transition easily to the real data. Continue reading
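One way to make that transition painless is to write the analysis as a function that only cares about the file layout, not about which file it is handed. The sketch below is a rough illustration of the idea, not code for the actual SynPUF files: the column names `BENE_ID` and `CLAIM_AMT` are placeholders I made up, so check the CMS codebook for the real variable names, and the three data rows are invented.

```python
import csv
import io
from collections import defaultdict


def total_paid_by_beneficiary(handle, id_col="BENE_ID", amount_col="CLAIM_AMT"):
    """Sum claim amounts per beneficiary from any open CSV file.

    The default column names are hypothetical placeholders.
    """
    totals = defaultdict(float)
    for row in csv.DictReader(handle):
        totals[row[id_col]] += float(row[amount_col])
    return dict(totals)


# A tiny invented file standing in for the synthetic data.
fake_file = io.StringIO(
    "BENE_ID,CLAIM_AMT\n"
    "A1,100.00\n"
    "A1,50.50\n"
    "B2,20.00\n"
)
totals = total_paid_by_beneficiary(fake_file)
print(totals)
```

Because the function takes an open file rather than a hard-coded path, moving from the synthetic file to the real one should only mean opening a different file, provided the two really do share the same structure.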