INTELLISOPHIC

DEEP MEANING AI

Mission: To turn human knowledge into machine usable data.

Intellisophic announces the beta product release of SAM-1: the world’s first semantic AI model (SAM) that aligns and fact-checks large language models (LLMs).

Paoli, PA – [04/15/2024] – Intellisophic, a leading provider of semantic data and software, is pleased to announce the release of SAM-1, the world’s first semantic AI model designed to assist large language model developers in training and assuring the quality of their LLM products.

Semantic AI, also known as the semantic web, is an area of research that focuses on enhancing the capabilities of artificial intelligence systems by incorporating semantic knowledge and understanding into their operation. With SAM-1, Intellisophic aims to revolutionize the field of AI by enabling machines to comprehend and reason about the meaning of information in a manner similar to human understanding.

At its core, SAM-1 leverages semantic technologies to enable AI systems to gain a deeper understanding of data and its context. By representing knowledge in a structured and machine-readable format using ontologies, knowledge graphs, and semantic networks, SAM-1 empowers machines to perform more intelligent tasks such as information retrieval, natural language processing, knowledge discovery, automated reasoning, and decision-making.

Intellisophic has built risk mitigation solutions in the national security, health, legal and regulatory sectors. SAM-1 is our solution to the risks of LLMs. LLM developers can now tap into the power of semantic AI to create more meaningful and intelligent interactions between humans and machines.

“Intellisophic has always been at the forefront of semantic AI innovation,” said George Burch, CEO of Intellisophic. “With the release of SAM-1, we are paving the way for the future of AI development and enabling developers to unlock the full potential of large language models.”

About Intellisophic:

For over twenty-five years, Intellisophic has been a trusted provider of semantic data and software to mitigate corporate and regulatory risks in legal, health, national security, and government sectors. Their proprietary DeepMeaning AI stack mines information from various content sources, including the billions of websites on the World Wide Web.

Intellisophic’s private knowledge graph, built on a semantic AI model (SAM) to W3C Web 3.0 standards, contains millions of concepts that humans use to understand and reason with. This vast knowledge graph is the foundation of SAM-1 and underpins both its accuracy and its alignment of large language models.

SAM-1: The LLM Quality Control Partner.

SAM-1 is designed to assist large language model developers in training and assuring the quality of their LLM products. It augments LLM training with a deep factual understanding of the concepts that humans use to understand and reason with.

SAM-1 models knowledge in a structured and machine-readable format using ontologies, knowledge graphs, and semantic networks. SAM-1 empowers machines to perform more intelligent tasks such as information retrieval, natural language processing, knowledge discovery, automated reasoning, and decision-making.

What sets SAM-1 apart is scale. Its unbounded knowledge graph is built using a proprietary knowledge extraction algorithm that builds knowledge graphs much faster than the manual or statistical clustering methods used by other semantic AI developers. SAM-1 also stores facts from authoritative sources for use in reasoning or fact checking.

The specific LLM issues SAM-1 addresses are training data quality, hallucinations and factual errors.

Training data quality.

The training paradigm SAM-1 supports is built on a corpus of billions of sentences, each with known source and ownership attributes, organized into fine-grained, domain-specific subnetworks. SAM-1 preprocesses the sentences into RDF triples before tokenization, creating a conceptual framework based on generally accepted gold-standard knowledge corpora. This data is licensed from private publishers of journals and reference corpora to protect our customers from legal liability outside fair use. SAM-1’s access to open-source information is on the same scale as that of LLMs, including CommonCrawl’s 3 billion websites. What sets us apart is the knowledge that is used to fact-check public sources.
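The preprocessing step described above, turning sentences into triples that keep their source and ownership attributes, can be illustrated with a minimal Python sketch. This is not SAM-1’s proprietary extraction algorithm: the regular-expression pattern, the fixed verb list, the `Triple` fields, and the `http://example.org/` namespace are all assumptions made for illustration.

```python
import re
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str
    source: str   # provenance attribute: where the sentence came from
    license: str  # ownership attribute: terms under which it may be used


# Hypothetical toy pattern: "<subject> <verb> <object>." for a small verb list.
# A real extractor would be far more sophisticated.
PATTERN = re.compile(
    r"^(?P<s>.+?)\s+(?P<p>is a|treats|causes)\s+(?P<o>.+?)\.?$",
    re.IGNORECASE,
)


def sentence_to_triple(sentence: str, source: str, license: str) -> Optional[Triple]:
    """Return a Triple for sentences matching the toy pattern, else None."""
    m = PATTERN.match(sentence.strip())
    if m is None:
        return None
    return Triple(
        m["s"].lower(),
        m["p"].lower().replace(" ", "_"),
        m["o"].lower(),
        source,
        license,
    )


def to_ntriples(t: Triple, base: str = "http://example.org/") -> str:
    """Serialize a Triple as one N-Triples style line (assumed namespace)."""
    def iri(term: str) -> str:
        return f"<{base}{term.replace(' ', '_')}>"
    return f"{iri(t.subject)} {iri(t.predicate)} {iri(t.obj)} ."


triple = sentence_to_triple("Aspirin treats headache.", "Example Journal", "licensed")
print(to_ntriples(triple))
```

In this sketch the provenance fields travel with each triple, so any downstream training step can still tell which publisher a statement came from.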

The scale and quality of the SAM-1 subnetwork training data provide the following benefits to LLM developers:

  1. Improved Representation Learning: Sub-networks can specialize in learning specific patterns or representations within the data.
  2. Efficient Parameter Sharing: Sub-networks can share parameters across different parts of the model.
  3. Reduced Vanishing Gradient Problem: Sub-networks can help to mitigate the vanishing gradient problem, which can occur in deep neural networks when gradients become very small during backpropagation.
  4. Improved Generalization: By allowing different sub-networks to specialize in different aspects of the task, the model can often achieve better generalization.
  5. Better Exploitation of Data Parallelism: Sub-networks can be trained in parallel, which can help to take advantage of modern hardware architectures. This can significantly speed up the training process and make it more efficient.

A further benefit is that the SAM-1 training data is identified and specifically licensed for use over a broad range of applications. This benefit mitigates many of the claims of illegal text appropriation in the training itself.

Hallucinations and factual errors. 

SAM-1 uses the Elastic MapReduce (EMR) cloud architecture to extract concepts from text in a Hadoop cluster. When LLM text is generated, SAM-1 identifies potential concepts and compares each concept’s semantic field (the tokens specific to that concept) with the actual words in the text. Concepts that have few or no words in common with the LLM sentences are considered hallucinations, which are conceptual errors. SAM-1 corrects the LLM by negating the use of the concept.
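The semantic-field comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the `min_overlap` threshold, and the example semantic fields are hypothetical, and the sketch runs in a single process rather than on an EMR or Hadoop cluster.

```python
import re
from typing import Dict, Iterable, List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphabetic word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def flag_hallucinated_concepts(
    text: str,
    candidate_concepts: Iterable[str],
    semantic_fields: Dict[str, Set[str]],
    min_overlap: int = 2,  # assumed threshold for "few or no words in common"
) -> List[str]:
    """Return candidate concepts whose semantic field barely overlaps the text."""
    words = tokenize(text)
    flagged = []
    for concept in candidate_concepts:
        field = semantic_fields.get(concept, set())
        if len(field & words) < min_overlap:
            flagged.append(concept)
    return flagged


# Hypothetical semantic fields, for illustration only.
fields = {
    "cardiology": {"heart", "artery", "cardiac", "valve"},
    "astronomy": {"star", "planet", "orbit", "telescope"},
}
generated = "The cardiac surgeon repaired the heart valve."
print(flag_hallucinated_concepts(generated, ["cardiology", "astronomy"], fields))
```

Here "cardiology" shares three tokens with the generated sentence and is kept, while "astronomy" shares none and is flagged.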

Factual errors are sentences in text that are contradicted by authoritative sources. The knowledge representation of facts for use in automating fact checking is a three-part data structure called a Predicate, a subject-predicate-object triple. SAM-1 has billions of facts available, organized by concept.
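As a rough illustration of fact checking against three-part facts organized by concept, consider the Python sketch below. The `FactStore` class, its method names, and the example fact are hypothetical, and the sketch assumes each subject and predicate pair has a single authoritative value, which real knowledge graphs do not guarantee.

```python
from collections import defaultdict
from typing import Dict


class FactStore:
    """Toy store of authoritative facts, grouped by concept (the subject)."""

    def __init__(self) -> None:
        # concept -> {predicate: object}; assumes one value per pair
        self._facts: Dict[str, Dict[str, str]] = defaultdict(dict)

    def add(self, subject: str, predicate: str, obj: str) -> None:
        """Record one subject-predicate-object fact."""
        self._facts[subject][predicate] = obj

    def check(self, subject: str, predicate: str, obj: str) -> str:
        """Classify a claimed triple as supported, contradicted, or unknown."""
        known = self._facts.get(subject, {}).get(predicate)
        if known is None:
            return "unknown"
        return "supported" if known == obj else "contradicted"


store = FactStore()
store.add("water", "boils_at_celsius", "100")  # illustrative fact only

print(store.check("water", "boils_at_celsius", "100"))
print(store.check("water", "boils_at_celsius", "90"))
print(store.check("water", "freezes_at_celsius", "0"))
```

Returning "unknown" separately from "contradicted" matters in this sketch: a claim absent from the store is not evidence of an error, only of missing coverage.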


SAM-1

Intellisophic’s Knowledge Graph (Ontology)

  • World’s largest prebuilt knowledge graph – hundreds of subject areas and millions of topics.
  • Over 3 billion websites have been indexed by topic.
  • Knowledge graphs are validated by recognized subject matter experts.
  • Billions of facts are represented as subject-predicate-object triples for fact checking.
  • Semantic AI model development – SDK and API for application developers.

Knowledge Graph Applications

  • Apply business analytics to knowledge data.
  • Know where your markets are heading.
  • Know what your competitors know (or don’t know).
  • Understand and measure your knowledge assets.
  • Control and tame IT sprawl.
  • Steer and align large language models (LLMs).

Knowledge Graph Customers

  • Supplier of knowledge graphs to the world’s leading companies.
  • Enterprise scale knowledge extraction for legal and pharma industries.
  • Supplier of knowledge graphs to national security and regulatory agencies.


CONTACT US

Serving LLM developers, Resellers, OEM and Enterprise Solution Providers.


(C) 2019-2024 Intellisophic – All rights reserved
