Ontologies for SingularityNET

bengoertzel · May 6, 2018, 1:28am

@akolonin has pointed out to me this site

which contains some public ontologies that potentially can be used/extended for some purposes within SingularityNET … This seems worthy of community discussion…

akolonin · May 6, 2018, 9:32am

@bengoertzel - http://schema.org is a kind of worldwide standard for semantic/structured markup on HTML pages, and for semantics on the web in general, sponsored by Google and co. It stands in place of Freebase, the complete open-source semantic web ontology which Google acquired 4 years ago, then renamed to Knowledge Graph, then pulled away from the public domain into private dungeons. In fact, if you want your page to be well indexed and found by Google, you had better use http://schema.org for structured page markup rather than mess with old SEO tricks.

Joel · May 7, 2018, 11:10pm

I guess one question is what the ontology will be used for?

I can see them being useful for interaction with expert and natural language systems, but the schema.org ontology isn’t really sufficient for describing categories as discovered by learning algorithms or for fully describing the constraints/requirements on types of inputs and outputs. Which isn’t surprising as it’s an artificial categorization of reality as opposed to what our services could be learning directly from data.

My intuition says that an ontology should be an opt-in layer of detail. You don’t need it to interact with services, and I think it will be a while before services are intelligent enough to introspect the published ontology/schema of other services to figure out their utility. It’s also pretty heavy-weight, so if you start requiring the details of schema.org in API calls it’s a pretty big burden on new developers.

It would help me (and the community) understand its importance if we had some clear examples of how it will be used. For example, only in API responses? Only in reference to entities defined in a canonical SNet database (see below)?

In terms of making rapid progress, and making SNet services easy to use by the community, a priority for us should be to create a repository of data types representing common objects. My current preference, as I’ve expressed internally, is for them to be defined in protobuf files.

These common protobuf files can then be easily imported by gRPC service definitions, and protobuf will take care of automatically generating client code across multiple languages. The protobuf compiler also has support for extended “options”, so when it comes to it, we can annotate them with ontological information as needed. The compiler will also transfer documentation from the protobuf files into the generated client code. Lastly, there are open source proxy servers that can convert from gRPC calls to REST endpoints if interacting with gRPC isn’t your thing.
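
To make the "opt-in annotation" idea concrete, here is a minimal sketch. It uses Python dataclasses as a stand-in for protobuf messages and custom options; it is not SingularityNET's actual schema. The `Image` type, its fields, the `ontology` metadata key, and the `ontology_annotations` helper are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field, fields


@dataclass
class Image:
    """Hypothetical shared data type, standing in for a protobuf message."""
    # The metadata dict plays the role of a protobuf custom option: an
    # annotation that clients may read or ignore. It does not change the data.
    content_url: str = field(metadata={"ontology": "https://schema.org/contentUrl"})
    caption: str = field(default="", metadata={"ontology": "https://schema.org/caption"})
    width_px: int = 0  # no annotation: the ontology layer is opt-in per field


def ontology_annotations(datatype) -> dict:
    """Return {field name: ontology term} for the annotated fields only."""
    return {
        f.name: f.metadata["ontology"]
        for f in fields(datatype)
        if "ontology" in f.metadata
    }


annotations = ontology_annotations(Image)
```

A client that does not care about ontologies just uses `Image` as a plain record; one that does can look up the annotations without any change to the API calls.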

Having a common datatype repository naturally provides us a process to allow the community to propose new datatypes by pull request, have these under version control, and have them be historically accessible to developers. A service can publish a git hash or tag of the datatype repo to ensure the correct version is being used, and so long as the client is using a newer version it should be ok (protobuf provides ways of gracefully dealing with new unknown fields, so long as existing fields don’t change their semantic meaning).
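
The compatibility rule above could look roughly like the following. This is a simplified illustration in plain Python, not protobuf itself (real protobuf parsers retain unknown fields rather than dropping them). The `Point` type, the version tuples, and both helper functions are assumptions made for the example.

```python
from dataclasses import dataclass, fields


@dataclass
class Point:
    """A hypothetical shared data type as the service knows it (older schema)."""
    x: float
    y: float


def decode_point(payload: dict) -> Point:
    """Service side: build a Point, skipping fields this schema version lacks."""
    known = {f.name for f in fields(Point)}
    return Point(**{k: v for k, v in payload.items() if k in known})


def client_is_compatible(client_version: tuple, service_version: tuple) -> bool:
    """The client is fine if its copy of the datatype repo is at least as new
    as the tag the service published (versions compared as tuples)."""
    return client_version >= service_version


# A newer client sends an extra field "z" that the service has never heard of.
decoded = decode_point({"x": 1.0, "y": 2.0, "z": 3.0})
```

The service still gets a valid `Point`, which only works as long as `x` and `y` keep the meaning they had when the service was built.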

The reason for asserting this, even under a post about ontologies, is that I think it’s easier to solve the ontology question after we have a functional network of agents. The applications will become clearer, and agents that need an ontology will come up with one. In the spirit of community-driven projects and open-source software, if a service author comes up with a good representation, other people will see the value and start using it.

Having expressed all that, I do agree a shared ontological grounding is needed for services to interoperate, and a hand-crafted one like schema.org is one option. An alternative to this approach could be an exemplar-based representation of entities. For example, in image classification, neural-network models will have subtly different understandings of what a “cat” is. The models could have been trained on different datasets, or even if their training data was the same, their random initialisation of weights before training can result in a different minimum being found.

I think the way to share common representations here is to provide example imagery. A model may categorise an image as high probability on output neuron 5, which maps to the symbol “cat”, but the model should also provide an exemplar of a cat. For image classification that would be an RGB image of a canonical, high-scoring cat. For the natural language domain, perhaps it would ground itself with an image that’s been highly associated in webpages with sentences about “cats”, or it might show high-frequency word associations for “cat”.
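
As a rough sketch of what such a response could carry, assuming a service keeps a score for each image it has seen: the `Exemplar` and `Classification` types, the file names, and the `top_exemplars` helper below are all hypothetical, not an existing SNet API.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Exemplar:
    uri: str      # where the example image can be fetched (made-up locations)
    score: float  # the model's own confidence on that example


@dataclass
class Classification:
    label: str                 # the symbol, e.g. "cat"
    probability: float         # confidence for the image being classified
    exemplars: List[Exemplar]  # what this model means by the label


def top_exemplars(scored_images: Dict[str, float], k: int = 2) -> List[Exemplar]:
    """Pick the k highest-scoring images as the model's canonical examples."""
    ranked = sorted(scored_images.items(), key=lambda kv: kv[1], reverse=True)
    return [Exemplar(uri, score) for uri, score in ranked[:k]]


cat_scores = {"a.png": 0.50, "b.png": 0.99, "c.png": 0.90}
result = Classification("cat", 0.87, top_exemplars(cat_scores))
```

Two services that both say “cat” can then be compared by looking at their exemplars rather than by trusting that the label means the same thing to each.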

I think this approach to ontology, one of exemplar-based representation, would be more powerful for shared representation and ensuring common semantic meaning even if it is less clean than a predefined and structured view of the world.

Another approach is saying, ok, yes we do want a structured ontology, but we want it applied to a shared database that SNet services can contribute to. We might have the “cat” symbol in the schema.org representation, but an image classification service can push to it and say “I know about cats”, including its high-scoring cat images. The inverse is that clients can directly ask each service what it knows about the “cat” entity, if anything. If we have a database then we’d need to curate it, but if individual services replied with parts of the data it would be extremely sparsely populated and not necessarily sufficient to ensure two results map to the same entity. It would be sparse because schema.org has a lot(!) of fields, and no way are people going to bother filling all of them out when building a service!
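
A toy version of the shared-database variant, just to show the shape of the interaction. It is an in-memory sketch under my own assumptions: `EntityRegistry`, its method names, and the service and file names are invented for illustration, and curation, persistence, and the actual schema.org fields are left out entirely.

```python
from collections import defaultdict
from typing import Dict, List


class EntityRegistry:
    """Hypothetical shared store mapping entity symbols to service claims."""

    def __init__(self) -> None:
        # entity symbol -> {service name -> exemplar URIs}
        self._claims: Dict[str, Dict[str, List[str]]] = defaultdict(dict)

    def register(self, entity: str, service: str, exemplars: List[str]) -> None:
        """A service says 'I know about <entity>' and attaches its exemplars."""
        self._claims[entity][service] = list(exemplars)

    def who_knows(self, entity: str) -> Dict[str, List[str]]:
        """Return every service that has claimed the entity, with exemplars."""
        return dict(self._claims.get(entity, {}))


registry = EntityRegistry()
registry.register("cat", "image-classifier-a", ["b.png", "c.png"])
registry.register("cat", "caption-service-b", ["tabby.jpg"])
cat_services = registry.who_knows("cat")
```

Each service only fills in what it actually has (here, a few exemplars), which is exactly why such a store would end up sparse compared with the full schema.org field set.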

To summarise my opinions:

we should avoid the ontology discussion getting in the way of API implementation
the use of an ontology is probably best initially confined to services where it is necessary (i.e. what are we using the ontology for?)
exemplar-based ontologies are more useful for long-term shared understanding between services

Of course, I don’t have quite as much historical context as others in the team, so I might be missing the core reason for using an ontology. But if I’m missing it, others in the community may be as well.

cassio · May 8, 2018, 5:22am

I strongly agree with Joel that the best way to think about the ontology is by being able to refer to working service examples. I also think it shouldn’t (and currently doesn’t) get in the way of API design and implementation, and data types are useful guidance for people designing and coding their APIs for integrating their own agents, as long as we preserve some flexibility for innovation there.

Tim · May 9, 2018, 1:32pm

Surely we should just supply the raw building blocks of a self identifiable consciousness. It is not necessary for us to design the blocks with the final constructed form in mind.

I think we will hinder progress by attempting to anticipate what shape those blocks should be. We should simply supply the raw base services, to offer any emerging AGI the freedom and flexibility to take whichever direction it feels best.

Tim · May 9, 2018, 7:05pm

May I ask exactly how, at present, an agent will look for subcontractor agents to fulfil a service request?

miguelrochefort · December 8, 2018, 2:15pm

Any progress on this?

Where can I find up-to-date information about SingularityNET’s APIs, protocols and ontologies?