Sessions | Corresponding Notebook(s) |
---|---|
Session 3 | First Script |
Session 4 | Random System |
Session 5 | Search |
Session 6 | Search Visual Engine Classificació |
Session 7 | [Classifier] Cerca Classificació |
Welcome to the visual searcher and classifier BUILDING RECOGNIZER (Terrassa Edition).
About Us
We are four students pursuing a Bachelor's Degree in Audiovisual Systems Engineering at the Universitat Politècnica de Catalunya (UPC), Terrassa school (ESEIAAT).
Victor Casales Hernández | Mònica Chertó Sarret |
Universitat Politècnica de Catalunya | Universitat Politècnica de Catalunya |
Francisco Ortiz Castillo | Eric Díaz Cívico |
Universitat Politècnica de Catalunya | Universitat Politècnica de Catalunya |
Our Project
We developed a visual search program based on computer vision. It identifies emblematic buildings and monuments of a city, in our case Terrassa. First, we created a database of 13 categories: 12 correspond to different buildings and monuments of Terrassa, and the 13th, named unknown buildings, contains random buildings from the same city. The program can identify an input photograph in two different ways:
- Search Visual Engine (Retrieval): The program produces a ranking of the buildings it thinks the input belongs to.
- Image classifier: The program assigns the input to one of the categories of the database, either one of the 12 known categories or the unknown buildings one.

The program is written in Python and uses the OpenCV library.
Development of the program
This program was proposed by our teachers of Management and Distribution of Audio-visual Signals. The project was split into several deliveries, one per week.
Outset
In the first week, the project database was created and we took our first steps with Python and computer vision algorithms by writing a small script that detected the interest points (feature descriptors) of an image. We decided to use the SURF algorithm as implemented in the OpenCV library. To build the database, our professors divided the work among different groups. Our group got two categories, "Escola d'Enginyeria de Terrassa" and "Catedral de Terrassa", and we also had to collect 50 pictures for the unknown buildings category. The pictures below are examples of the images we collected for the database:
First Draft (Random)
During the second week we created the first draft of our final program, which produces the two results explained above: image classifier and Search Visual Engine. This draft was kept deliberately simple, just to give ourselves a first idea of the whole pipeline. It used random values and produced random results, so its scores were poor.
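A random baseline of this kind can be sketched in a few lines. This is only an illustration, not the code of the actual draft: the category names, image ids, and function names below are hypothetical.

```python
import random

# Hypothetical category names: 12 buildings plus the "unknown" category.
CATEGORIES = ["building_%02d" % i for i in range(1, 13)] + ["unknown"]


def random_ranking(train_ids, rng):
    """Return the training image ids in random order (an uninformed ranking)."""
    ranking = list(train_ids)
    rng.shuffle(ranking)
    return ranking


def random_label(rng):
    """Pick one of the 13 categories uniformly at random."""
    return rng.choice(CATEGORIES)


rng = random.Random(0)  # fixed seed so the run is repeatable
train_ids = ["img_%03d" % i for i in range(10)]  # hypothetical image ids
ranking = random_ranking(train_ids, rng)
label = random_label(rng)
```

A baseline like this ignores the image content entirely, so any real system should score clearly above it.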
Search Visual Engine (Retrieval)
In the third week, we worked on one of the main parts of the program: the Search Visual Engine. This part required a training process. Training was done with the k-means algorithm, a clustering algorithm that partitions the feature descriptors extracted from the training images (the part of the dataset used for training) into different regions. Afterwards, the feature descriptors of the validation images were assigned to these regions, so that each image ended up with a feature vector. From these vectors we could create the rankings for the validation images.
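The pipeline described above can be sketched with a small pure-Python example. Several things here are assumptions made for illustration: the descriptors are toy 2-D points rather than real SURF descriptors, the feature vector is taken to be a normalized count of descriptors per region, images are ranked by squared Euclidean distance, and the k-means initialization is a simple deterministic one. None of these details is confirmed by the project itself.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid (region) closest to the point."""
    return min(range(len(centroids)), key=lambda i: sq_dist(point, centroids[i]))


def kmeans(points, k, iterations=20):
    """Plain k-means. Evenly spaced initial centroids are a simplification."""
    centroids = [list(points[i * len(points) // k]) for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def feature_vector(descriptors, centroids):
    """Normalized count of descriptors falling in each region."""
    counts = [0] * len(centroids)
    for d in descriptors:
        counts[nearest(d, centroids)] += 1
    total = sum(counts) or 1
    return [c / total for c in counts]


def rank(query_vec, train_vecs):
    """Training image names sorted from most to least similar to the query."""
    return sorted(train_vecs, key=lambda name: sq_dist(query_vec, train_vecs[name]))


# Toy descriptors per training image (hypothetical data).
train = {
    "a": [[0.0, 0.1], [0.1, 0.0], [0.0, 0.0]],
    "b": [[5.0, 5.1], [5.1, 5.0], [5.0, 5.0]],
    "c": [[0.0, 0.0], [5.0, 5.0]],
}
pooled = [d for descs in train.values() for d in descs]
centroids = kmeans(pooled, k=2)
train_vecs = {name: feature_vector(d, centroids) for name, d in train.items()}

query_vec = feature_vector([[0.05, 0.05], [0.1, 0.1]], centroids)
result = rank(query_vec, train_vecs)
```

In this toy run the query descriptors all fall in the same region as those of image "a", so "a" is ranked first and "b" last. In practice a library implementation of k-means would replace the hand-written one.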
Classifier
The last part of the program is the classifier. The goal of this part was to train our program to assign each image to a category. The SVM algorithm was used for this purpose. However, we had to account for the number of images in each category, since the unknown buildings category has more images than the others. So, when running the SVM algorithm, we gave this category less weight.
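One common way to put less emphasis on an over-represented category is to weight each class inversely to its size. The sketch below computes such weights with the formula `n_samples / (n_classes * class_count)`, which is the heuristic scikit-learn applies for `class_weight="balanced"`. Whether the project used this exact weighting is an assumption, and the label counts are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * n) for c, n in counts.items()}


# Hypothetical label counts, not the real dataset sizes.
labels = ["catedral"] * 10 + ["escola"] * 10 + ["unknown"] * 40
weights = balanced_class_weights(labels)
# The larger "unknown" class gets a smaller weight than the building classes.
```

These weights would then be passed to the SVM so that errors on the smaller categories cost more than errors on the unknown buildings category.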
Results
System | Retrieval (MAP) | Classification (Accuracy) |
---|---|---|
Random | 0.016 | 0.022 |
Building Recognizer | 0.30901 | 0.75556 |
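For reference, the two metrics in the table can be computed as sketched below. This uses one common definition of average precision (precision at each relevant hit, averaged over all relevant items); the exact definition used in the project's evaluation is not stated, so treat it as an assumption. The rankings and labels are toy data.

```python
def average_precision(ranking, relevant):
    """AP of one ranked list: precision at each relevant hit, averaged."""
    hits = 0
    precisions = []
    for k, item in enumerate(ranking, start=1):
        if item in relevant:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / len(relevant) if relevant else 0.0


def mean_average_precision(rankings, relevants):
    """MAP: the mean of the per-query average precisions."""
    aps = [average_precision(r, rel) for r, rel in zip(rankings, relevants)]
    return sum(aps) / len(aps)


def accuracy(predicted, expected):
    """Fraction of images whose predicted category matches the expected one."""
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


# Toy data: two queries with their rankings and relevant items.
rankings = [["a", "x", "b", "y"], ["x", "a"]]
relevants = [{"a", "b"}, {"a"}]
map_score = mean_average_precision(rankings, relevants)

acc = accuracy(["a", "b", "c", "a"], ["a", "b", "a", "a"])
```

In the toy data the first query has an average precision of (1/1 + 2/3) / 2 and the second of 1/2, and three of the four predicted labels are correct.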