statsmodels.datasets.get_rdataset(dataname, package='datasets', cache=False)

Download and return an R dataset.


dataname : str

The name of the dataset you want to download

package : str

The package in which the dataset is found. The default is the core ‘datasets’ package.

cache : bool or str

If True, the data is downloaded into the STATSMODELS_DATA folder, which defaults to a folder called statsmodels_data in the user's home folder. If a string, it is used as the path to the folder in which the data is cached. If False, the data is not cached.


dataset : Dataset instance

A statsmodels.data.utils.Dataset instance. This object has the following attributes:

* data - A pandas DataFrame containing the data
* title - The dataset title
* package - The package from which the data came
* from_cache - Whether cached data was retrieved
* __doc__ - The verbatim R documentation.
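
The attributes listed above can be inspected as in the following minimal sketch. It assumes network access on first use and that the `iris` dataset is available from the core `datasets` package. The helper names `fetch_iris` and `describe_rdataset` are hypothetical and are not part of statsmodels.

```python
def fetch_iris(cache=False):
    """Download the `iris` dataset from R's core `datasets` package."""
    # Imported lazily so the helper below can be used without statsmodels.
    from statsmodels.datasets import get_rdataset

    # Assumption: `iris` is available; any other dataset name works the same way.
    return get_rdataset("iris", package="datasets", cache=cache)


def describe_rdataset(dataset):
    """Collect the documented attributes of a returned Dataset into a dict."""
    return {
        "title": dataset.title,
        "package": dataset.package,
        "from_cache": dataset.from_cache,
        "shape": dataset.data.shape,  # (rows, columns) of the pandas DataFrame
    }
```

Calling `describe_rdataset(fetch_iris())` triggers a download, and `fetch_iris().__doc__` holds the R documentation text for the dataset.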


If the R dataset has an integer index, it is reset to be zero-based; otherwise the index is preserved. The caching facilities are simple: no download dates, ETags, or other identifying information are checked to decide whether the data should be downloaded again. If the dataset is in the cache, it is used.
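
Because a cached copy is never revalidated, one way to force a fresh download is to keep the cache in a dedicated folder and empty that folder before fetching. The following is a sketch of that workaround, not a statsmodels facility: `reset_cache_dir` and `fetch_fresh` are hypothetical helpers, and the folder passed in is assumed to hold nothing but this cache.

```python
import shutil
from pathlib import Path


def reset_cache_dir(cache_dir):
    """Delete and recreate a dedicated cache folder, discarding stale files."""
    cache_dir = Path(cache_dir)
    if cache_dir.exists():
        shutil.rmtree(cache_dir)  # removes everything in the folder
    cache_dir.mkdir(parents=True)
    return cache_dir


def fetch_fresh(dataname, cache_dir, package="datasets"):
    """Empty the cache folder, then download and cache the dataset again."""
    # Imported lazily so reset_cache_dir can be used without statsmodels.
    from statsmodels.datasets import get_rdataset

    cache_dir = reset_cache_dir(cache_dir)
    # Passing a path as `cache` stores the download in that folder.
    return get_rdataset(dataname, package=package, cache=str(cache_dir))
```

Passing `cache=False` also avoids reading stale data, at the cost of downloading on every call.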
