ImageNet Graph Database
ImageNet is an image database containing about 14 million images. The project arranges more than 33 thousand meaningful, picturable synsets (word groups such as “cat” or “table”) in a hierarchical graph structure, and the images related to each group are collected under it. The word groups are taken from the WordNet project, a lexical database of the English language.
Together, the 14 million images and more than 33,000 word groups form a comprehensive image database that any intelligent agent capable of visual learning can use to train itself. The project was initiated by Fei-Fei Li in 2007 at Princeton University in the United States; the data was collected from the internet through crowdsourcing and then sorted into groups.
In Iran, because of limitations on using the ImageNet project, we decided to localize it under the name “Tasvir Net” at the Research Institute of Cyberspace (Telecommunication Research Center of Iran), located in north AmirAbad.
The project comprised the following three phases:
- First phase: research on the project
- Second phase: implementation of the ImageNet project
- Final phase: localization and expansion of the ImageNet project using Persian data
The project was run by a four-member team managed by Dr. Farzad Zargari. I was the software architect, the database designer, and the site designer.
The first phase examined the scope of the ImageNet project and its feasibility. In this phase we studied the structure of the ImageNet database and whether it could be extended with existing words from the Persian lexicon.
At first the data structure was assumed to be a tree. After closer examination, however, I realized that some cases break the tree structure. For example, rice is a child of both grains and starch. The data structure is therefore a directed acyclic graph, not a tree.
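The rice example can be checked mechanically. The following is a minimal Python sketch, not the project's code: the edge list is a hypothetical fragment (only the relation between rice, grains, and starch comes from the text; "food" and "wheat" are made up for illustration), and the function names are assumptions. It lists the nodes that have more than one parent, which rule out a tree, and uses Kahn's algorithm to confirm that the graph has no cycle.

```python
from collections import defaultdict

# Hypothetical is-a edges as (parent, child); "rice" has two parents.
EDGES = [
    ("food", "grain"),
    ("food", "starch"),
    ("grain", "rice"),
    ("starch", "rice"),
    ("grain", "wheat"),
]


def build_graph(edges):
    """Return (children, parents) adjacency maps for a list of edges."""
    children = defaultdict(list)
    parents = defaultdict(list)
    for parent, child in edges:
        children[parent].append(child)
        parents[child].append(parent)
    return children, parents


def multi_parent_nodes(edges):
    """Nodes with more than one parent; any such node rules out a tree."""
    _, parents = build_graph(edges)
    return sorted(node for node, ps in parents.items() if len(ps) > 1)


def is_acyclic(edges):
    """Kahn's algorithm: the graph is acyclic if every node can be removed
    once all of its parents have been removed."""
    children, parents = build_graph(edges)
    nodes = set(children) | set(parents)
    indegree = {node: len(parents.get(node, [])) for node in nodes}
    ready = [node for node in nodes if indegree[node] == 0]
    removed = 0
    while ready:
        node = ready.pop()
        removed += 1
        for child in children.get(node, []):
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)
    return removed == len(nodes)


print(multi_parent_nodes(EDGES))  # nodes that break the tree assumption
print(is_acyclic(EDGES))          # whether the hierarchy is still cycle-free
```

A node with two parents is enough to disqualify a tree, while the acyclicity check confirms that the hierarchy still has a well-defined direction from general to specific.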
Overlooking this property forced a re-review of all the languages and technologies selected for implementation. We had originally decided to use PHP as the language and MySQL as the database. MySQL is a relational database management system (RDBMS). Databases of this type are based on the relational theory formulated by Edgar F. Codd. By their nature, relationships between records are simulated and data is modeled as entities, so a hierarchy has to be mapped onto tables.
As software architect, I examined all the mapping options available in an RDBMS. Some mappings read quickly but write slowly, and others the reverse. A decision was needed, and because writes and insertions in this project happened about once a year, I focused on read speed and chose a model that numbers the tree when data is inserted. Each node is assigned one number on its left and one number on its right. The number of a node's descendants can be computed directly from the difference between its right and left numbers, so some operations run in constant time.
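The left and right numbering described above matches what is commonly called the nested set model. The sketch below is written in Python rather than the PHP and MySQL originally planned for the project, and it assumes the standard form of that model, in which a node's descendant count is (right - left - 1) / 2. The tree contents and function names are hypothetical.

```python
# Hypothetical tree: parent -> ordered list of children.
TREE = {
    "food": ["grain", "starch"],
    "grain": ["rice", "wheat"],
}


def assign_bounds(tree, root):
    """Number the tree depth-first: a node receives its left number when it
    is entered and its right number when all its children are done."""
    bounds = {}
    counter = 1

    def visit(node):
        nonlocal counter
        left = counter
        counter += 1
        for child in tree.get(node, []):
            visit(child)
        bounds[node] = (left, counter)
        counter += 1

    visit(root)
    return bounds


def descendant_count(bounds, node):
    """Constant-time descendant count from a node's own two numbers."""
    left, right = bounds[node]
    return (right - left - 1) // 2


def is_descendant(bounds, node, ancestor):
    """A node lies under an ancestor if its interval is nested inside."""
    left, right = bounds[node]
    a_left, a_right = bounds[ancestor]
    return a_left < left and right < a_right


bounds = assign_bounds(TREE, "food")
print(bounds)
print(descendant_count(bounds, "food"))
```

Because every node holds a single (left, right) pair, it occupies exactly one position in the hierarchy, and inserting a node means renumbering the nodes that come after it.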
However, this model carries a heavy cost for data insertion. The problem with words like “rice” was that, under this model, a node cannot be placed as the child of two parents. Two months of effort were therefore lost and I started again. After further research I concluded that the RDBMS model is not a good fit for such projects, so I turned to NoSQL databases. None of the team members had any knowledge of this type of database, so I began studying the subject myself. Finally Neo4j, a database with a fully graph-based nature, was selected. Unlike an RDBMS, this database is built on graph theory, and relationships are defined separately as elements in their own right. The query language used with this database is called Cypher. Version 2 was used in development. Lacking prior technical knowledge, I learned Cypher by studying English-language references and built the complete ImageNet data set on this database.
Extensibility with Persian words
The second part of the research had to answer this question: how can the ImageNet data be extended with Persian words? FarsNet became the reference for this part of the project. Much like our localization of ImageNet, FarsNet is a localized counterpart of an English-language project, WordNet, which was first developed at Princeton University. FarsNet, of which two versions had been released by 2015, had too many problems to be usable. The project plan stipulated that FarsNet data should be used, but a report that I prepared that year and sent to the group manager showed that, because of its many defects, we could not use it as a reference for producing the localized ImageNet. It was therefore decided to translate the ImageNet words, which comprise 33 thousand word groups, and use those instead.
In the implementation phase, the Neo4j graph database was used as the database and Node.js as the back end of the site. HTML, CSS, and JavaScript were used for the web design. Implementation included the following:
- Downloading 14 million images: about 50% of the images were inaccessible due to filtering, and an unknown number of images were no longer available. We therefore first separated out the filtered images. Then, using Bash scripts on an Ubuntu system, we started downloading in the data center of the Research Institute of Cyberspace. Another server with a VPN was used to continue the process. The download program ran in parallel.
- Designing and writing the REST API: smart agents needed an API to connect to the database. The API had to be designed so that it did not put heavy load on the server.
- Translating 33,000 word groups: translation was another challenge; online translators were used to translate the data automatically.
- Transferring 33,000 word groups to the database: regular expressions were used to convert the data into Cypher statements that could be written to the database.
- Finding and amending FarsNet nodes.
- Connecting images to their word groups.
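The conversion of word-group relations into Cypher with regular expressions might look like the sketch below. It is an illustration under assumptions, not the project's script: the input format (one "parent_id child_id" pair per line), the made-up identifiers, the Synset label, the wnid property, and the HAS_CHILD relationship type are all hypothetical. It is written in Python and only builds the Cypher text; running the statements against Neo4j is left out.

```python
import re

# Assumed input format: "parent_id child_id", each id being the letter "n"
# followed by eight digits. The ids used below are invented for illustration.
LINE_RE = re.compile(r"^\s*(n\d{8})\s+(n\d{8})\s*$")


def line_to_cypher(line):
    """Turn one relation line into a Cypher statement, or None if the line
    does not match the assumed format."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    parent, child = match.groups()
    # MERGE creates each node and relationship only if it is not yet there,
    # so a child listed under two parents ends up as a single node.
    return (
        f"MERGE (p:Synset {{wnid: '{parent}'}}) "
        f"MERGE (c:Synset {{wnid: '{child}'}}) "
        f"MERGE (p)-[:HAS_CHILD]->(c);"
    )


def convert(lines):
    """Convert all well-formed lines and silently skip the rest."""
    statements = (line_to_cypher(line) for line in lines)
    return [s for s in statements if s is not None]


sample = [
    "n00000001 n00000003",   # first parent of the shared child
    "n00000002 n00000003",   # second parent of the same child
    "not a relation line",   # skipped
]
for statement in convert(sample):
    print(statement)
```

Since the pattern only accepts identifiers made of "n" and digits, nothing from the input can alter the structure of the generated statement.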
Localization using Persian lexicons
At this point we first realized that many of the words already exist in the ImageNet project itself. For example, “sajadeh” and “Aba”, which are related to Islamic culture, and concepts such as “noroz” and “Haftseen” exist in the structure of the ImageNet data. Only a small number had to be added; for example, some stews were not in the structure. These items first had to be found. After verifying that they did not exist in the current data, using the robot provided by the Parsijo team (national search engine), photos were downloaded from Google and the validity of the images was checked by crowdsourcing.
Eventually, the project was completed after one year of research and development at the Research
Iran Telecommunication Research Center
Oct 2015 - Aug 2016