Human Faces - detection, landmarks, identities


My first contribution to SNet will be a set of services for detecting faces in images and video.

I picked this because it's a good example of a micro-ecosystem: it shows how SNet services will eventually be composable and able to rely on one another.

Before you can get a 128-D vector representing the identity of a face you need to:

  1. find face bounding boxes in an image
  2. detect facial keypoints
  3. align (or normalise) the face to be both frontal and upright

Only then can you reliably determine the identity. However, these earlier steps can be useful to other services in their own right, and not only that… there are different algorithms, with different performance characteristics, for each of them.
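As a small illustration of step 3, here is a minimal pure-Python sketch of bringing a face upright from two detected eye centres. The function names are mine for illustration, not part of the repo:

```python
import math

def upright_angle(left_eye, right_eye):
    """Tilt of the face, in degrees: the angle of the line through
    the two eye centres relative to the horizontal axis."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.degrees(math.atan2(dy, dx))

def rotate_point(point, center, angle_deg):
    """Rotate a 2-D point around `center` by -angle_deg, i.e. undo
    the tilt so the eye line becomes horizontal."""
    theta = math.radians(-angle_deg)
    x, y = point[0] - center[0], point[1] - center[1]
    return (center[0] + x * math.cos(theta) - y * math.sin(theta),
            center[1] + x * math.sin(theta) + y * math.cos(theta))
```

In practice you'd build the equivalent similarity transform with OpenCV (`cv2.getRotationMatrix2D` plus `cv2.warpAffine`) and apply it to the whole image, not just the keypoints.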

At the moment I have essentially just put a wrapper around the amazing dlib (Davis is one of my heroes) and OpenCV algorithms.
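For the detection step, such a wrapper can stay very thin: dlib's frontal face detector returns rectangle objects, and the service only needs to convert them into something serialisable. A rough sketch, assuming `pip install dlib` (the dict layout and function names are my own, not the repo's actual API):

```python
def rect_to_bbox(rect):
    """Convert a dlib-style rectangle (an object exposing left()/top()/
    right()/bottom()) into a plain dict, which is easy to serialise
    across a service boundary."""
    return {"x": rect.left(), "y": rect.top(),
            "w": rect.right() - rect.left(),
            "h": rect.bottom() - rect.top()}

def detect_faces(image, detector):
    """Run a detector (e.g. dlib.get_frontal_face_detector()) over an
    image and return plain bounding boxes; the 1 asks dlib to upsample
    the image once so smaller faces are found."""
    return [rect_to_bbox(r) for r in detector(image, 1)]
```

Keeping the dlib types out of the return value is what lets an OpenCV-based detector be swapped in behind the same interface.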

In the future this repo might expand to offer a face morphing service, do emotion detection, or anything else that can benefit from localising faces, and the parts of faces, in an image.


This is awesome! There are so many applications for this technology. Do you have a sort of roadmap for what the next steps are in this project? :blush:


I don't want to overload your existing project, but how long might it take to turn this into a content storage system? I mean an inventory program that can do:

[face tracking] + [hand tracking] + [article tracking] +
[gesture tracking] +
[person && group tracking]

assuming all the data for such a project were well prepared? As far as I know, that's the biggest obstacle.

And roughly how many examples would each algorithm need to learn from?


There are a few projects that attempt solutions to each of those problems. Among the most popular open source projects, there is:

But the license is for non-commercial use unless you contact them about a commercial license (one reason dlib is so awesome is that its license is quite permissive).

SNet is meant to be a distributed network where services can rely on one another, so I could imagine that once these individual problems have associated services to solve them, you could have higher level services that will aggregate image annotations and then also track them in video streams.


Wow, that looks great!
Perfect, that's exactly the precision I anticipated for the idea I have in mind, as shown in the clip, in terms of the data points received.
Very interesting.
So with it, it might be possible to have a tracker, for example in my household, that keeps track of all my stuff.
Where the keys are, where the remote is, or who took the last of the milk.
Like Snow White.


There’s a huge world of research around human faces: from full 3D reconstruction from a single image, to morphing faces to look like someone else, to detecting people’s emotions!

I’ve integrated all I was initially planning to add for now, but as I come across research (or pull out previous papers I’ve read) I may add them to this issue:

If people want to see a specific face-related model wrapped as a SNet service, they should feel free to create a new issue in the repo (or open a pull request :wink: )


dlib has a great license, but the 68-point landmark model doesn’t allow commercial use.
I had to train my own because of this (only 26 points, though). If you want, you can get it here.


My interpretation of the licensing for the 68-point model is that it’s ambiguous. dlib’s model was trained on data that doesn’t allow commercial use, but the model itself is not explicitly restricted. I don’t know of any legal precedent on the transitivity of licenses and how far it extends. For example, if you use the dlib model to predict on new data, clean up the predictions, and then train a new model on your own data, does the original data license still apply? What if you repeat the process 100 times?

Lawyers I’ve talked to in the past have not given me a definitive answer.

But thanks for sharing your 26-point model! That’s really great :slight_smile: dlib also has its own 5-point model without the license restriction.


Hehe, that’s actually what I did: I used the 68-point model to annotate my own dataset and trained again, limiting it to 26 points for a smaller model with only the necessary points (it’s faster too).
I don’t think it will be tied to their license.
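For anyone curious, the point-subsetting part of that trick is trivial; the real work is the re-training. A sketch of the subsetting, where the index set below is my own guess at useful points from the standard 68-point scheme, not the actual 26 chosen for the model linked above:

```python
# Hypothetical subset of the 68 landmark indices (iBUG-style numbering,
# which dlib's 68-point model uses). The real 26-point model's selection
# is not published in this thread.
KEEP = [36, 39, 42, 45,   # outer and inner eye corners
        30, 31, 35,       # nose tip and nostril edges
        48, 51, 54, 57,   # mouth corners, top and bottom lip
        8]                # chin tip

def subset_landmarks(points_68):
    """Keep only the chosen keypoints from a full 68-point annotation."""
    return [points_68[i] for i in KEEP]
```

The reduced annotations then become the training labels for a new, smaller `dlib.shape_predictor`, which is also faster at inference time since fewer regressors are evaluated.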

Anyway, here it explicitly says that “… the trained model therefore can’t be used in a commercial product”.


Yeah, that’s what the dataset creator asked Davis to say; you’ll see that he suggests you talk to them or to a lawyer. And I think that’s because the transitivity of licenses between datasets and machine learning models hasn’t been well established in law.

At the moment the face-services repo is just a proof of concept. If SNet starts exchanging real value, I will either disable the 68-keypoint model, integrate your model, or create my own 68-point face dataset (and then release it for anyone to use :wink: )




I actually have a facial annotation service and can produce the facial landmarks for you. I just joined here 2 minutes ago, so I am really glad to see this post!

Check out the service…