Types of Data: A High Level Explanation

December 23, 2022
in Software Development


This article covers the different types of data. As some of you may be aware, data can be broken down into several types. One categorization that is very useful when building a machine learning pipeline is the following: structured data, semi-structured data, and unstructured data.

So What Is the Difference Between These Types of Data?


Structured Data

This term refers to data that is organized in a tabular format, or in something like a relational database, which organizes data in multiple tables that can then be joined together. Structured data is the easiest type of data to work with. If your data is stored in an SQL database, for example, most data scientists will find it fairly easy to access the database and extract insights from the data. That said, not all databases are created equal: some are organized badly, while others are organized in a way that is easy to use. But all things being equal, structured data is easy to work with.
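As a minimal sketch of what "multiple tables which can then be joined together" can look like in practice, the example below uses Python's built-in sqlite3 module with an in-memory database. The table names, column names, and rows are hypothetical and invented purely for illustration.

```python
import sqlite3

# Hypothetical tables and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO orders VALUES (1, 1, 20.0), (2, 1, 5.0), (3, 2, 12.5);
""")

# Join the two tables and total the order amounts per customer.
rows = conn.execute("""
    SELECT c.name, SUM(o.amount)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()
conn.close()

print(rows)
```

Because every row in a table has the same columns, a single query is enough to combine the tables and summarize them, which is a large part of why this kind of data is easy to work with.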

If you look deep down into how machine learning pipelines are created, you always need structured data. Even if your data is in a semi-structured or unstructured format, what algorithms do internally is still digest this data and then transform it into a structured format.
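One simple way to picture this transformation is a bag-of-words table, where free text becomes rows of word counts. This is only an illustrative sketch: bag-of-words is a deliberately simple stand-in, not a description of what any particular algorithm does internally, and the sample texts are hypothetical.

```python
from collections import Counter

# Hypothetical support questions, invented for illustration.
texts = ["reset my password", "password reset failed", "cancel my order"]

# Vocabulary: every distinct word, in sorted order, becomes one column.
vocab = sorted({word for t in texts for word in t.split()})

def to_row(text):
    """Count how often each vocabulary word occurs in one text."""
    counts = Counter(text.split())
    return [counts[word] for word in vocab]

# Each text is now a fixed-width row of numbers: a structured table.
matrix = [to_row(t) for t in texts]

print(vocab)
for row in matrix:
    print(row)
```

Every row has the same length regardless of how long the original text was, which is the property a downstream model typically relies on.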

Figure: Unstructured, semi-structured, and structured data

Semi-Structured Data

This term refers to data that is not completely organized but not disorganized either. Good examples of this are HTML, JSON, and XML. If you are not familiar with JSON, it is very easy to search for it and see an example of what a JSON file looks like. You will quickly see that JSON follows some kind of structure, and the same is true for HTML. You see something that looks like code, but JSON and HTML are not fully structured: they are not organized in a table.

An HTML file or a JSON file can look very different from some other HTML or JSON file. This means that the developers of those files take certain freedoms, and this can make the files somewhat challenging to work with.

How Do Data Scientists Collect Data From Different Sources?

A data scientist will have to extract information from the semi-structured data and then restructure it into a tabular format. The challenge here is that there are usually many ways to do that. This step can be quite time-consuming, depending on the kind of data and how it is organized.
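The restructuring step can be sketched as follows, using only Python's standard library. The records, field names, and the flatten helper are hypothetical, and the choices made here (dotted column names, joining list items with a semicolon) are just one of the many possible ways mentioned above.

```python
import json

# Hypothetical records: the second one lacks "tags" and adds "location".
raw = '''
[
  {"id": 1, "user": {"name": "ana"}, "tags": ["a", "b"]},
  {"id": 2, "user": {"name": "ben"}, "location": "Lisbon"}
]
'''

def flatten(record, prefix=""):
    """Turn nested dicts into a single-level dict with dotted keys."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        elif isinstance(value, list):
            # One of several possible choices: join list items into one cell.
            flat[name] = ";".join(str(v) for v in value)
        else:
            flat[name] = value
    return flat

records = [flatten(r) for r in json.loads(raw)]

# The columns are the union of all keys seen; missing cells become None.
columns = sorted({key for r in records for key in r})
table = [[r.get(col) for col in columns] for r in records]

print(columns)
for row in table:
    print(row)
```

Even in this tiny case there are decisions to make, such as how to handle missing fields and nested lists, which is where much of the time goes on real data.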

In general, I’m not a huge fan of semi-structured data. As a data scientist, I prefer structured data, like most data scientists. However, semi-structured data is very useful in domains like social media. Social media is full of text data, image data, and video data, and formats like JSON let us store this data alongside meta information.

So you can store a video, let’s say, and then you can store who created it, the comments around it, etc. This is easier to do using JSON than using SQL, for example. This is why semi-structured formats have become so popular in the last ten years. Semi-structured data quite often goes hand in hand with NoSQL databases and big data.
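A small sketch of the video example, assuming a hypothetical post record whose field names are invented for illustration: the video reference, its creator, and its comments all live together in one nested JSON document.

```python
import json

# Hypothetical video post; every field name here is an assumption.
post = {
    "video_file": "clip_001.mp4",
    "created_by": "ana",
    "comments": [
        {"author": "ben", "text": "Nice!"},
        {"author": "cy", "text": "Great clip"},
    ],
}

# Serialize to a JSON string and read it back, as a document store might.
encoded = json.dumps(post)
decoded = json.loads(encoded)

# The nested comments come back with the post; no join is needed.
comment_authors = [c["author"] for c in decoded["comments"]]
print(comment_authors)
```

In a relational design, the comments would typically sit in a separate table linked to the video by a key, so the nested form trades some structure for convenience.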

Unstructured Data

This term refers to data where there is clearly no structure. For example, a data set that consists only of images, videos, or audio is an unstructured data set. Information in an unstructured data set does not follow a preexisting data model. This makes it quite challenging to work with, because someone might have to go through all the data and determine whether some of it is noisy or has other issues that would prevent a machine learning pipeline from being built successfully.

In the real world, you are usually going to encounter unstructured data in two situations.

The first is some sort of open data set or a machine learning competition where someone curates an unstructured data set. You must use this data and try to predict, for example, whether a photo contains humans or animals as best as you can. The other case where you might encounter unstructured data is when a data strategy was not designed, and a company somehow ended up with unstructured data instead of semi-structured data. In most scenarios, we expect to see this data alongside some meta information, such as when a video showed up and who posted it, if we’re talking about social media.

Figure: Unstructured vs. structured data

How Does a Data Scientist Digest this Type of Data?

I would expect that, in most cases, most of the data should be semi-structured. There are still cases where data might just be unstructured because there is not much that we can do about it. For example, in customer support, a data set might consist of questions and responses. You want to build a bot based on those questions and responses so it can automatically produce answers to different queries.

Well, in this case, there is probably not much you can do to structure the data. One way or another, you will end up with an unstructured data set. But unstructured data, even though it is challenging, can quite often still be analyzed successfully.

In most cases, we use deep learning algorithms to digest this kind of data. Deep learning has been very successful with audio data, natural language data, images, and similar data. In this area, I’ve worked in sports analytics, creating predictive models for football injuries and recovery after injuries; I’ve worked in financial predictions; and I’ve studied the application of deep learning in manufacturing. The results are very encouraging.

Conclusion

This was a summary of the different types of data that you can encounter in business.

We talked about structured data, semi-structured data, and unstructured data. Structured data is usually the low-hanging fruit for a business. Ideally, as a business, you want to have a data strategy that ensures that most of your data is stored in a structured format. This makes the life of data scientists much easier, and they will be able to spend more time on valuable tasks instead of just data wrangling.

Semi-structured data and unstructured data have grown in the last 10-15 years. It’s the era of big data, after all. But in most cases, you should try to turn to structured data and semi-structured data. And once again, semi-structured data is a difficult topic because of the kind of database you need to choose, how you should organize the different fields, and for what purpose.


