- An Introduction to Data
- Relationships and the Data Story
- The Limitations of Data
- Tabular Data Format
- Metadata Introduction
- Accessing Data
What is data? Data are facts and statistics gathered together for reference or analysis. The screenshot on the left shows a small piece of the Palmer Penguins dataset in Python. There’s nothing special about that dataset other than that it’s well known among those studying data analytics.
The book Data Science and Big Data Analytics quotes the McKinsey & Co. report Big Data: The Next Frontier for Innovation, Competition, and Productivity, which says: “Big Data is data whose scale, distribution, diversity, and/or timeliness require the use of new technical architectures and analytics to enable insights that unlock new sources of business value.”
Metadata is data about data.
Organizing Your Data
Any time you spend organizing your personal data is time well spent. Align your file naming conventions with those of your company or organization, and talk with the people on your project to confirm them. Create a simple text file that lists the conventions with examples so others can see them. When you are setting those conventions, here are a few things to consider.
- Make your file names meaningful
- Use the project name and the date
- Make the date sortable with a format like YYYYMMDD, and add version numbers if necessary
- Avoid spaces and special characters in file names
- Use underscores, dashes, and uppercase and lowercase letters in place of spaces and special characters
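The naming guidelines above can be sketched in a few lines of Python. This is only an illustration under assumed conventions, not a required standard: the function name `build_file_name`, the order of the parts, the two-digit `v02` version style, and the example project name are all hypothetical choices made for this sketch.

```python
import re
from datetime import date


def build_file_name(project, description, on, version=None, extension="csv"):
    """Build a file name from a project name, a description, a date, and a version."""

    def clean(text):
        # Replace each run of spaces or special characters with a single dash,
        # then drop any dash left at the start or end.
        return re.sub(r"[^A-Za-z0-9]+", "-", text.strip()).strip("-")

    # YYYYMMDD sorts chronologically when file names are sorted as text.
    parts = [clean(project), clean(description), on.strftime("%Y%m%d")]
    if version is not None:
        # Zero-padded version (assumed style) so v02 sorts before v10.
        parts.append(f"v{version:02d}")
    return "_".join(parts) + "." + extension


name = build_file_name("Penguins", "bill lengths (raw)", date(2024, 3, 5), version=2)
print(name)  # Penguins_bill-lengths-raw_20240305_v02.csv
```

Underscores separate the parts of the name and dashes separate words within a part, so the result contains no spaces or special characters.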
When using folders, create subfolders in a logical hierarchy: put broad concepts at the top and get more specific with each folder you create beneath them. Separate ongoing work from completed work with folder names such as Working and Final. Archive older files and folders in separate archive folders or in an external storage location. Are your files automatically backed up? If not, back them up manually. The process of creating folders is called foldering.
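As a minimal sketch of such a hierarchy, the Python below creates Working, Final, and Archive subfolders under one project folder. The function name `create_project_folders`, the project name `Penguins_Study`, and the `Archive` folder name are assumptions made for illustration. The example runs in a temporary directory so it leaves nothing behind.

```python
import tempfile
from pathlib import Path


def create_project_folders(root, project):
    """Create Working, Final, and Archive subfolders under root/project."""
    project_dir = Path(root) / project
    created = []
    for name in ("Working", "Final", "Archive"):
        folder = project_dir / name
        # parents=True also creates the project folder itself if it is missing;
        # exist_ok=True makes the call safe to repeat.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


# Try it in a scratch location that is deleted when the block ends.
with tempfile.TemporaryDirectory() as scratch:
    folders = create_project_folders(scratch, "Penguins_Study")
    folder_names = [folder.name for folder in folders]
    all_exist = all(folder.is_dir() for folder in folders)

print(folder_names)  # ['Working', 'Final', 'Archive']
```

On a real project you would pass your own project location instead of a temporary directory, and you might add more specific subfolders under Working as the work grows.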