Artificial Intelligence and Machine Learning have emerged as some of the hottest technologies during the past couple of years. Major enterprises are going to great lengths to incorporate AI and ML into their products and operations, while vendors are rushing to develop the platforms to support those AI and ML deployments. Many organizations that used to bill themselves as “Big Data” vendors have rebranded themselves as AI and Machine Learning companies. This makes sense, as the recent proliferation of AI and ML is only possible as a result of the rise of Big Data during the internet era.
To leverage AI and ML, organizations need to collect massive quantities of data to train algorithms. AI and ML are only possible because of the broad, concerted efforts to collect Big Data that have been ongoing for over a decade. But simply amassing Big Data isn’t enough: to build working AI and ML models, that data must be stored in a way that leverages rich metadata.
Legacy Storage Leverages Little Metadata
Big Data is typically kept in legacy file storage systems. These systems are labyrinths: data scientists struggle to find the right data for their AI and ML models because searching for and locating data in massive systems with limited metadata is difficult. The ability to collect metadata is determined by storage architecture, and traditional file systems don’t gather much of it.
The lack of metadata makes it difficult to determine what is in a file (for example, the contents of a video or image), where that file came from geographically, and other identifiers that are critical to giving it context. As a result, finding data in a file system is especially difficult in industries such as media and entertainment or healthcare, which rely heavily on video and other images that are challenging to search.
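To make the limitation concrete, the following minimal Python sketch prints the descriptive fields a typical file system keeps for a file. It uses the standard `os.stat` call; the file name `scan_0001.dcm` and the helper `filesystem_metadata` are hypothetical, chosen only for illustration, and the file is filled with placeholder bytes rather than a real image.

```python
import os
import tempfile
import time


def filesystem_metadata(path):
    """Collect the descriptive fields the file system records for a file."""
    st = os.stat(path)
    return {
        "name": os.path.basename(path),
        "size_bytes": st.st_size,
        "modified": time.strftime("%Y-%m-%d", time.localtime(st.st_mtime)),
        "owner_uid": st.st_uid,
        "mode": oct(st.st_mode & 0o777),
    }


# Hypothetical stand-in for an X-ray image; the bytes are placeholders.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "scan_0001.dcm")
    with open(path, "wb") as f:
        f.write(b"\x00" * 1024)
    meta = filesystem_metadata(path)
    print(meta)
```

Nothing in the result says who the patient is or which part of the body was imaged. That context has to live in the file name, the folder path, or a separate system.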
Object Storage Employs Rich Metadata
Object storage is a newer storage architecture that leverages metadata in ways file storage doesn’t. While traditional file storage describes data with a limited set of metadata tags (file name, date created, date last modified, etc.) and organizes it into folders, object storage describes data with unconstrained types of metadata and makes all of it accessible through a single API, so it is searchable and easy to analyze. For example, a traditional X-ray file would only have metadata describing basics like creation date, owner, location and size. An X-ray object, on the other hand, could carry metadata that identifies the patient’s name, age, injury details and which area of the body was X-rayed, making it much easier to locate via search.
In object storage systems, metadata is customizable, which means users can attach far more identifying information to each piece of data. Instead of being limited to fixed tags, custom metadata allows new concepts to be captured, enabling robust searchability. In the X-ray example, this means that metadata can be added to identify diagnostic information so a doctor can easily determine whether and where there was a previous injury. As another example, consider a broadcaster managing its growing video archive: IT staff can customize metadata to tag each news program with its anchor, so programs can be found by anchor name.
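The idea of custom metadata can be sketched with a toy in-memory store. This is an illustration only, not any vendor's API: the `ObjectStore` and `StoredObject` classes, the object keys, and the metadata field names (`patient`, `body_part`, `finding`, `anchor`) are all assumptions made up for the example. Real object stores expose metadata through their own interfaces and usually rely on an index for search.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    key: str
    data: bytes
    metadata: dict = field(default_factory=dict)


class ObjectStore:
    """Toy in-memory store: flat key space, free-form metadata per object."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data, metadata=None):
        # Copy the metadata so later changes by the caller do not leak in.
        self._objects[key] = StoredObject(key, data, dict(metadata or {}))

    def search(self, **criteria):
        """Return sorted keys of objects whose metadata matches every criterion."""
        return sorted(
            obj.key
            for obj in self._objects.values()
            if all(obj.metadata.get(name) == value for name, value in criteria.items())
        )


store = ObjectStore()
store.put("xray/0001", b"<image bytes>",
          {"patient": "Patient A", "age": 54, "body_part": "wrist", "finding": "fracture"})
store.put("xray/0002", b"<image bytes>",
          {"patient": "Patient B", "age": 31, "body_part": "knee", "finding": "none"})
store.put("news/0420", b"<video bytes>",
          {"program": "Evening News", "anchor": "Anchor X"})

wrist_fractures = store.search(body_part="wrist", finding="fracture")
by_anchor = store.search(anchor="Anchor X")
print(wrist_fractures)  # ['xray/0001']
print(by_anchor)        # ['news/0420']
```

Each object carries whatever fields make sense for it, so X-rays and news programs can sit in the same store and still be found by their own attributes. The linear scan in `search` keeps the sketch short; a production system would query an index instead.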
Object storage provides the foundation for a range of AI and ML use cases today, such as advanced surveillance applications deployed in smart cities, targeted advertising applications from marketers and quality inspection applications used by manufacturers. In the surveillance example, object storage supports pattern-detection apps that recognize faces, logos, landmarks, and other categories of content. These apps use metadata to describe attributes like colors, sizes, gender and location, so it’s easy to find the right images and videos.
This extensive use of metadata is only possible with object storage, which makes it the best storage architecture for supporting AI and ML. As a result, organizations looking to realize the full value of AI and ML applications would be wise to consider the storage component in their infrastructure decisions rather than focusing strictly on compute requirements.