{"id":5526,"date":"2025-05-17T07:09:56","date_gmt":"2025-05-16T23:09:56","guid":{"rendered":"https:\/\/cicserver.com\/vast-data-adds-vector-search-and-deepens-google-cloud-ties-blocks-and-files\/"},"modified":"2025-05-17T07:09:56","modified_gmt":"2025-05-16T23:09:56","slug":"vast-data-adds-vector-search-and-deepens-google-cloud-ties-blocks-and-files","status":"publish","type":"post","link":"https:\/\/cicserver.com\/de\/vast-data-adds-vector-search-and-deepens-google-cloud-ties-blocks-and-files\/","title":{"rendered":"VAST Data adds vector search and deepens Google Cloud ties \u2013 Blocks and Files"},"content":{"rendered":"<p><br \/>\n<\/p>\n<div>\n            <!-- image --><\/p>\n<div class=\"td-post-featured-image\"><a href=\"https:\/\/blocksandfiles.com\/wp-content\/uploads\/2025\/05\/VAST-teaser.jpg\" data-caption=\"\"><img fetchpriority=\"high\" decoding=\"async\" width=\"696\" height=\"435\" class=\"entry-thumb td-modal-image\" src=\"https:\/\/blocksandfiles.com\/wp-content\/uploads\/2025\/05\/VAST-teaser-696x435.jpg\" srcset=\"https:\/\/blocksandfiles.com\/wp-content\/uploads\/2025\/05\/VAST-teaser-696x435.jpg 696w, https:\/\/blocksandfiles.com\/wp-content\/uploads\/2025\/05\/VAST-teaser-300x188.jpg 300w, https:\/\/blocksandfiles.com\/wp-content\/uploads\/2025\/05\/VAST-teaser-768x480.jpg 768w, https:\/\/blocksandfiles.com\/wp-content\/uploads\/2025\/05\/VAST-teaser-672x420.jpg 672w, https:\/\/blocksandfiles.com\/wp-content\/uploads\/2025\/05\/VAST-teaser.jpg 950w\" sizes=\"(max-width: 696px) 100vw, 696px\" alt=\"\" title=\"VAST teaser\"\/><\/a><\/div>\n<p>            <!-- content --><\/p>\n<p><a href=\"https:\/\/blocksandfiles.com\/2025\/02\/19\/vast-data-expands-into-block-storage-and-event-broking\/\">VAST Data<\/a> has added vector search to its database and integrated its software more deeply into Google\u2019s cloud.<\/p>\n<p>The database is part of its software stack layered on top of its DASE (Disaggregated Shared Everything) storage foundation along 
with the Data Catalog, DataSpace, unstructured DataStore and DataEngine (InsightEngine). Generative AI large language models (LLMs) manipulate and process data indirectly, using numerical representations \u2013 vector embeddings or just vectors \u2013 of multiple dimensions of an item. An intermediate abstraction of a word in a text document is a token. These are vectorized and a document item\u2019s vectors are stored in a multi-dimensional space with the LLM searching for vectors as it computes steps in its generation of a response to user requests. This is called semantic searching.<\/p>\n<p>A VAST Data <a href=\"https:\/\/www.vastdata.com\/blog\/introducing-vast-vector-search-real-time-ai-retrieval-without-limits\">blog<\/a> by Product Marketing Manager Colleen Quinn says: \u201cVector search is no longer just a lookup tool; it\u2019s becoming the foundation for real-time memory, context retrieval, and reasoning in AI agents.\u201d<\/p>\n<p>Vectors are stored by specialized vector database suppliers \u2013 think <a href=\"https:\/\/blocksandfiles.com\/2025\/01\/23\/pinecone-assistant-builds-rag-ai-agents\/\">Pinecone<\/a>, <a href=\"https:\/\/blocksandfiles.com\/2024\/12\/03\/weaviate-launches-vector-embeddings-generation-service\/\">Weaviate<\/a> and <a href=\"https:\/\/blocksandfiles.com\/2023\/06\/14\/zilliz-ai-vector-database-free-tier\/\">Zilliz<\/a> \u2013 and are also being added as a data type by existing database suppliers. Quinn says that the VAST Vector Search engine \u201cpowers real-time retrieval, transactional integrity, and cross-modal governance in one platform without creating new silos.\u201d\u00a0<\/p>\n<p>In the VAST world, there is a single query engine, which can handle SQL, vector, and hybrid queries. It queries VAST\u2019s unstructured DataStore and the DataBase, where vectors are now a standard data type. 
Quinn says: \u201cVector embeddings are stored directly inside the VAST DataBase, alongside traditional metadata and full unstructured content to enable hybrid queries across modalities, without orchestration layers or external indexes.\u201d<\/p>\n<p>\u201cThis native integration enables agentic systems to retrieve memory, reason over metadata, and act \u2013 all without ETL pipelines, external indexes, or orchestration layers.\u201d<\/p>\n<p>\u201cThe system uses sorted projections, precomputed materializations, and CPU fallback paths to maintain sub-second performance \u2013 even at trillion-vector scale. And because all indexes live with the data, every compute node can access them directly, enabling real-time search across all modalities \u2013 text, images, audio, and more \u2013 without system sprawl or delay.\u201d<\/p>\n<p>\u201cAt query time, VAST compares the input vector to all stored vectors in parallel. This process uses compact, columnar data chunks to prune irrelevant blocks early and accelerate retrieval.\u201d<\/p>\n<p>\u201cFuture capabilities will expand beyond vector search, enabling new forms of hybrid reasoning, structured querying, and intelligent data pipelines.\u201d Think multi-modal pipelines and intelligent data preparation.<\/p>\n<h3 class=\"wp-block-heading\">Google Cloud<\/h3>\n<p>Building on its <a href=\"https:\/\/blocksandfiles.com\/2024\/04\/11\/vast-data-google-cloud\/\">April 2024 announcement<\/a> that it had ported its Data Platform software to Google\u2019s cloud, enabling users to spin up VAST clusters there, VAST has now gone further. 
It says its Data Platform \u201cis fully integrated into Google Cloud \u2013 offering a unified foundation for training, retrieval-augmented generation (RAG), inference, and analytics pipelines that span across cloud, edge, and on-premises environments.\u201d<\/p>\n<p>Renen Hallak, VAST founder and CEO, spoke of a \u201cleap forward,\u201d stating: \u201cBy combining the elasticity and reach of Google Cloud with the intelligence and simplicity of the VAST Data Platform, we\u2019re giving developers and researchers the tools they need to move faster, build smarter, and scale without limits.\u201d<\/p>\n<p>The additional VAST facilities now available on GCP include:<\/p>\n<ul class=\"wp-block-list\">\n<li>InsightEngine enabling developers and researchers to run data-centric AI pipelines\u2014such as RAG, preprocessing, and indexing\u2014natively at the data layer.<\/li>\n<li>DataSpace with its exabyte-scale global namespace which connects data on-premises, at the edge, and in Google Cloud as well as other hyperscalers for data access and mobility.<\/li>\n<li>Unified file (NFS, SMB), object (S3), block, and database access.<\/li>\n<\/ul>\n<p>VAST says customers can run AI, ML, and analytics initiatives without operational overhead and\u00a0unify their AI training, RAG pipelines, high-throughput data processing, and unstructured data lakes on its single, high-performance platform.<\/p>\n<p>The base VAST software has already been ported to AWS, with <a href=\"https:\/\/aws.amazon.com\/marketplace\/search\/results?searchTerms=VAST+Data\">v5.2<\/a> available in the AWS Marketplace. 
We understand v5.3 is the latest version of VAST\u2019s software.\u00a0<\/p>\n<p>There is limited VAST availability on the <a href=\"https:\/\/azuremarketplace.microsoft.com\/en-us\/marketplace\/apps\/vastdata1650451243415.vastdata-lb-marketplace?tab=Overview\">Azure Marketplace<\/a>, where \u201cVAST\u2019s virtual appliances on Azure allow customers to deploy VAST\u2019s disaggregated storage processing from the cloud of their choice. These containers are free of charge and customers interested in deploying Universal Storage should contact VAST Data to get their capacity under management. This product is available as a Directed Availability release.\u201d<\/p>\n<h3 class=\"wp-block-heading\">Comment<\/h3>\n<p>With its all-in-one storage and AI stack, VAST Data is becoming the equivalent of a software AI system infrastructure mainframe environment, built from modular storage hardware boxes, NVMe RDMA links to x86 and GPU compute, not forgetting Arm (BlueField). Both compute and storage hardware are commodities for VAST. But the software is far from a commodity. It is VAST\u2019s core proprietary IP, and is being developed and extended at a high rate, with a promise of being uniformly available across the on-premises environment and the AWS, Azure, and Google clouds. For better or worse, as far as we are aware, no other storage or system data infrastructure company is working on such a broad and deep AI stack at the same pace.<\/p>\n<\/p><\/div>","protected":false},"excerpt":{"rendered":"<p>VAST Data has added vector search to its database and integrated its software more deeply into Google\u2019s cloud. The database is part of its software stack layered on top of its DASE (Disaggregated Shared Everything) storage foundation along with the Data Catalog, DataSpace, unstructured DataStore and DataEngine (InsightEngine). 
Generative AI large language models (LLMs) manipulate [&hellip;]<\/p>","protected":false},"author":3,"featured_media":5527,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_seopress_robots_primary_cat":"","_seopress_titles_title":"","_seopress_titles_desc":"","_seopress_robots_index":"","footnotes":""},"categories":[1],"tags":[],"class_list":{"0":"post-5526","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-blog"},"_links":{"self":[{"href":"https:\/\/cicserver.com\/de\/wp-json\/wp\/v2\/posts\/5526","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/cicserver.com\/de\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/cicserver.com\/de\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/cicserver.com\/de\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/cicserver.com\/de\/wp-json\/wp\/v2\/comments?post=5526"}],"version-history":[{"count":0,"href":"https:\/\/cicserver.com\/de\/wp-json\/wp\/v2\/posts\/5526\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/cicserver.com\/de\/wp-json\/wp\/v2\/media\/5527"}],"wp:attachment":[{"href":"https:\/\/cicserver.com\/de\/wp-json\/wp\/v2\/media?parent=5526"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/cicserver.com\/de\/wp-json\/wp\/v2\/categories?post=5526"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/cicserver.com\/de\/wp-json\/wp\/v2\/tags?post=5526"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}