HNSSearch and the future of decentralized search

A general talk about the future of decentralized search and its necessity, as well as a presentation on the current state of HNSSearch and future developments.

Transcript

(00:01) [Music] [Applause]

(01:10) all right everybody welcome back uh again thank you to impervious for sponsoring our event uh right now we got andy very very well known in the community definitely one of our cornerstone uh developers always building some cool stuff um so um mike did you want to add anything yeah you’re saying it already i mean he’s super friendly super approachable i mean since since he’s been working on hns search i’ve been chatting to him on all the different channels from handshake mercenary days to uh yeah with

(01:44) the oog days yeah man yeah i’m having my breakfast even banana nice i am here all right all right i’m really excited for your talk let’s let’s jump right into it all right thank you for that great intro i’m really excited to be here and also thank you for organizing it it’s amazing to be part of handycon 2.

(02:05) i will share my screen and then we can start all right you should see my presentation i will be talking about hns search and the future of decentralized search i want to kind of also tap into the market for search engines and just also discuss what we think that the decentralized search should be so maybe just a bit about me and who we are i started hns search the first version about a year ago it launched in february 2021 uh then we’ve kind of founded hns factory so far we’re only two developers working on it

(02:51) but our core focus of course so far is still hns search which is very much in development right now we are also producing or trying to produce the handshake hunts comics and we are currently working on a comic book which should be released this year where we will show some stories uh from the d-web and just also some pains from being in the hns space that early on but let’s get into search engines so just to start like how a basic or how the basic architecture of a search engine is we have three components the first one

(03:31) is the crawler that’s basically going to the websites gets the content based on the rule set it gets the title the content the images whatever then we have an indexer that that takes this data and then basically prepares it for the database and then the search engine itself which then gives it to the user when you enter your search query so centralized search engines these days are of course the most popular ones and the problem with them is that they first of all they control the crawler so they decide which websites are getting

(04:09) crawled which ones are not they also set the rules for the indexer about what they want you to see or what not they of course manipulate and define the search results i have an example of that just in a second they do store your data and they also sell your data uh storing data is in in one aspect great because it gives you customized search results on the other hand it just means they really know basically everything about you now we have these so-called privacy oriented centralized search engines where in general basically everything is the

(04:52) same except the fact that they don’t store your data and they don’t sell your data but they still control the index or the crawler and they still control what results you see and what not a small example for example if you search on google for handshake you will find that you have about 93 million results that’s really a lot wow but if you actually try to go to the end of the search results you will reach page 24 with a total of 238 results if you then click this nice little link where it says repeat the search with the

(05:32) omitted results you don’t get too much further you will reach just page 47 with about 500 results of course we can argue that if you already go to the second page of any search engine you you went into the dark space of the internet but what i want to show with this example is just the fact that even though there are 93 million results the search engine itself chooses which 500 results they want to present you and that in our opinion is one of the fundamental problems uh because how or why should the search engine

(06:14) decide what you can see and what not and google is by far not the only one doing that they’re just one of the only search engines that still shows how many results they have they’re also debating of removing that information because as you can see it is rather pointless so but let’s get into a bit more uh happy thoughts and talk about our vision at hns factory and what we think a decentralized search engine should be so of course first of all it should be decentralized in our opinion it should be free open source the crawler should be open

(06:53) source the index should be open source so you can actually see what is being indexed how is it being crawled which information do we gather the index itself should be open it should be usable by anyone even if it’s not in our search engine the front end there should be many front ends that tap into that open index you as a user should have the full control of what you see so we or a decentralized search engine should not pre-select your results of course no data logging and no tracking and in general there should be a network of

(07:30) nodes which sync to that distributed index so as soon as someone logs into the network they will download the latest index and like that the index is again distributed and that makes it of course also censorship resistant which leads me to the benefits of a decentralized search engine of course decentralization there are no monopolies which means like google right now has over 70 percent of market share even more that wouldn’t happen like this anymore you have anonymity no one will know your search queries there is no bias in either crawling or

(08:14) providing the search results no censorship and it would be of course a disruption of the current markets ad networks would have to be rethought and just also normal revenue streams that would have to be rethought so but the talk is also of course about hns search so let’s talk about what we’ve been doing so far we build our platform what we’re building now with hns factory on these four uh layers at the bottom we have linda i will talk about her just in a bit she is our indexer and crawler we have d-web pulse which will be

(08:55) released in a couple of weeks two months then our public index and hns search that sits on top of all of this but let’s quickly talk about linda uh she’s our indexer but also our crawler stands for living index for navigating decentralized assets she’s at the center of everything so she’s written in go just a tech uh spec here she controls the crawling the indexing we have built-in machine learning that analyzes the content and categorizes it so it analyzes the text images and sets a category for it

(09:41) she is built with handshake in mind but she can be extended so you well it’s two lines of code and she will be open to crawl the whole web too or with a bit more effort you can easily crawl other protocols so even though she’s built mainly for handshake there’s nothing much that stops you from forking linda and just creating a normal normal uh a normal search engine so that’s really the foundation without her we have no information and and we can’t really create an index but when we we started with linda and then

(10:25) we had our index and we we directly put hns search on top and we we came into the realization that we are working in the censorship resistant space but the problem is maybe you as the user don’t want to see everything that we crawl because some stuff might not be appropriate for you so we decided to also launch as our next project so before we actually launch hns search uh d-web pulse which is actually an open index which is provided by linda and its categorization but everyone has the opportunity there to

(11:04) also change the categories so when we say this is a blog this is a news website this is an exchange any user that doesn’t think that’s correct can make a request so that the category is changed of course our crawler is also still learning linda is still learning what that gives us is a is a public index of hns websites with this content-based categorization and the fact that you as a user can change it this gives us then in return the opportunity to create block lists i know we’re talking about censorship

(11:41) resistance and now i’m talking about block lists but these lists just mean that you as a user or a trusted entity like skyinclude niami namebase whoever can use this information and say block the sites with this category and you can set this in the search engine and then when you search you just won’t see those sites which have this specific category of course you can also create your own list but we think we need some kind of mechanism so we are not censoring the content and we are also not censoring what we are indexing but

(12:20) you have the opportunity to actually limit what you see based on your requirements and of course uh other search engines can tap into d-web pulse and use our index and our search results and that’s what we are going to do with hns search once d-web pulse is launched we’re gonna also tap into that index so yeah q2 is is the target for the d-web pulse release where we will release the website the index and also the code for the crawler so let’s get into hns search hns search we it will be divided in two

(13:03) parts so that one part will be the web front end and one part will be the desktop client here i’m talking about the standalone website which will incorporate those block lists i just talked about will be fully open source you can fork the search engine you can fork the index you can fork the crawler you can even swap the index on the website if you wish to hns search comes out of the box usable but if you want you can actually hook it up to your own index or to another provided index but it also gives the toolkit for you to

(13:40) create your own search engine we really want to give you all opportunities to to use the d web and the new internet as you feel fit so if you don’t want to do any customization hns search will be there you use it uh with the hns search domain and that’s it you have to do no customization we take care of everything but if you wish so you can change basically anything and of course the goal is also to make it as user friendly as possible and from the beta which ended in december we took your feedback we went back to a more

(14:22) well to a simpler design and tried to really keep it lean and we brought back dark and light mode the purple was popular with some people but most people wished for purple and dark themes so that is uh scheduled for q3 this year where we release the website the code for the website of course and lists from d-web pulse which will be integrated into hns search then for the last topic the desktop client that is really the completion of our vision of a decentralized search engine where you can really you host it on your own machine

(15:07) like you will have full control so you download the node it is a standalone desktop client the index will be stored the public index will be stored on your machine once you log in and you can launch the crawler from within the client so if you notice that your website or a website you’re looking for is not indexed you can just launch a crawl job you put this link in it will crawl it will sync and everyone else on the network including the hns search website will have the newest index so like this we have a truly decentralized

(15:43) search experience where it’s also almost impossible as long as we have some functioning nodes to take it down and you can either tap into the public index or you can actually create your own private index i’m thinking if you have a company and you have an intranet like you can actually merge that with the public index but only use it for yourself so if you search something you will only see internal or you will see internal and external things mixed but here really we give full control to the user to create

(16:19) a decentralized truly decentralized search engine where everyone contributes to the the index and where you can basically customize it as you see fit so a desktop client is set to be launched in q4 really aim to to release everything this year and we’re really excited to yeah to bring you hopefully a good decentralized experience there are already other alternatives not handshake based but the goal is really to make it as user-friendly as possible so if you want to customize it you can but if you don’t want to customize it you still don’t

(17:03) have to compromise on privacy and on decentralized decentralization so that would bring me uh to the end uh so yeah i would be open for questions or if we’re already out of time uh thank you for your patience and your interest we got time we got uh i just see two questions one from stephen right here i don’t know if you can see it on screen yeah i mean the machine learning is already integrated so the indexer actually well we have a catalog of categories which we manually set and the indexer actually learns with each category and

(17:44) we actually have a punishment system where we go through and we check right now we check if the categories are correct if not then linda kind of gets punished and gets minus points so for the index for the crawl it’s kind for linda it’s kind of like a game if she makes a mistake she she kind of gets points deducted and like this she’s learning and uh yeah i think that explains it i just saw a comment where it says linda was made with uh meilisearch no meilisearch is actually a search engine which just provides the

(18:20) the engine to search so you create the index with meilisearch as a json file and they just provide kind of the the engine to to get stuff out but we are moving away from from meilisearch because meilisearch is amazing but they offer a search as you type and we just found that for a search engine search as you type is not that desirable because yeah you get just too many results as you type your query so the engine itself like linda is actually what indexes everything and that’s fully handmade by us

(18:59) and the machine learning happens as she crawls the sites she analyzes the content and based on previous experiences she then categorizes them i hope that answers the yes linda will also be open source and then uh dope was also just asking where you can find more information about linda as well we are planning on uh actually creating a linda handypadia page but we’re just not there yet we we have our heads deep in now d-web pulse but we want to really document everything we’re doing because we think that’s also helpful for anyone

(19:42) that wants to customize we’re really working on our docs as well so that when we release the code the docs are are clear and for those who like to tinker with stuff they actually can and maybe what i didn’t say before is each element of hns search like the index or the front end the crawler each element is individually usable so you we have it as one module combined but you can actually just take the crawler and make your own search engine because we believe that for for the d-web to to be truly successful

(20:17) it doesn’t have to be hns search of course it would be amazing but for us it’s more important that people get also into the habit of making new uh making new search engines because search is what really fuels the the internet now but of course for that we also need content that’s one of the biggest problems we see now because most of the stuff we crawl are either d-links or advertisement pages for people selling their domain that’s why i’m also very happy we have this website race and i hope there is

(20:48) a new content which we can crawl but yes let’s get some more content on the d-web all right uh let’s just get this last question and then we’ll uh we’ll wrap it up yeah that’s that’s for sure oscar no problem we there we can we can definitely provide the search uh search bar for uh for people to uh incorporate we had it before with all h and s uh well now of course it broke because with the beta ending doesn’t work anymore but it will definitely be possible and we will provide that so people can actually

(21:24) put our search bar on their website for sure all right well andy thank you for being here talking more about hns search we love hearing about it we love hearing the developments and i i think we can all say that we’re really excited to see how it all turns out by the end of this year i certainly am for sure um but any any last words just thank you very much for this awesome community it’s just a treat to be talking to to all of you on discord if you have questions just hit us up on twitter or in in the discord channels we

(21:58) love hearing your feedback we love your energy and just fuels us to to spend our nights working on on this side project which we really fell in love with and the handshake is just amazing and the community as well so thank you all and thank you for listening all right awesome thank you so much for those uh for those good words uh we’re gonna take a quick little break here and then we’re gonna be going into uh niami with uh stefan xerox stefan so that’s going to be a good one to listen in on so i’ll see you there guys
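
[Editor's sketch] The three components andy describes around 03:31 (crawler, indexer, search), together with the category-based block lists from 11:41, can be illustrated with a small Python example. This is a minimal sketch under stated assumptions, not how linda or hns search is actually built (linda's code had not been released at the time of the talk). The page data, the function names, the category names, and the keyword rule that stands in for machine-learning categorization are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical in-memory "web": URL -> (title, body text).
# A real crawler would fetch pages over the network instead.
PAGES = {
    "http://blog.example/": ("My blog", "notes about handshake names"),
    "http://shop.example/": ("Domain shop", "handshake names for sale"),
    "http://news.example/": ("News", "decentralized search announced"),
}


@dataclass
class Document:
    url: str
    title: str
    text: str
    category: str


def crawl(urls):
    """Crawler: visit each URL and yield its raw content."""
    for url in urls:
        title, text = PAGES[url]
        yield url, title, text


def categorize(text):
    """Stand-in for content-based categorization (a keyword rule, not ML)."""
    if "for sale" in text:
        return "marketplace"
    if "announced" in text:
        return "news"
    return "blog"


def build_index(crawled):
    """Indexer: turn raw pages into an inverted index (word -> set of URLs)."""
    docs, inverted = {}, {}
    for url, title, text in crawled:
        docs[url] = Document(url, title, text, categorize(text))
        for word in set((title + " " + text).lower().split()):
            inverted.setdefault(word, set()).add(url)
    return docs, inverted


def search(query, docs, inverted, blocked_categories=frozenset()):
    """Search: return every matching URL, minus categories the user blocked."""
    words = query.lower().split()
    if not words:
        return []
    hits = set.intersection(*(inverted.get(w, set()) for w in words))
    return sorted(u for u in hits if docs[u].category not in blocked_categories)


docs, inverted = build_index(crawl(PAGES))
print(search("handshake", docs, inverted))
print(search("handshake", docs, inverted, blocked_categories={"marketplace"}))
```

In this sketch the index itself is never filtered: the block list is applied only at query time and is chosen by the caller, which mirrors the distinction drawn in the talk between censoring what gets indexed and letting the user limit what they see.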

(22:32) [Music] [Applause] kenetic is a blockchain crypto investment firm based in hong kong and puerto rico in 2016 they were the first fund in hong kong and one of the earliest in asia with a portfolio of over 220 companies they were seed investors in such projects as ethereum parity and

(23:36) polkadot solana ftx and of course handshake and namebase [Music] founder jehan chu was an active investor and supporter of the handshake ecosystem over one hundred thousand domains co-founder of d-web foundation co-founder of handycon and sponsor of the handshake house at miami hack week 2022 [Music] [Applause]

(24:41) [Music] [Applause]