Welcome to Web-SIREN!


What is SIREN?

SIREN - (SI)milarity (R)etrieval (EN)gine - is a command language interpreter that adds similarity query capabilities to SQL. Web-SIREN is a web-based interface that makes SIREN resources accessible over the internet (complete language syntax).

SIREN requires user identification to allow access. To try our demo site, you may use:

  • User: user1
  • Password: user1

    The current prototype is under development. Although it already supports the STILLIMAGE and AUDIO data types (as examples of the MONOLITHIC data type) and the PARTICULATE data type, some features of the language are still being implemented (data types and extractors available).

    Several datasets are already loaded on the Web-SIREN site. One of them, called Cars (available at http://lib.stat.cmu.edu/), describes 392 cars. It has nine attributes: MPG (miles per gallon), number of cylinders, engine displacement (cu. inches), horsepower, vehicle weight (lbs.), time to accelerate from 0 to 60 mph (sec.), model year (modulo 100), origin of the car (American, European or Japanese) and the car name.

    Another dataset, called MedImages, consists of computed tomography (CT) images of three human body parts: abdomen, cranium and thorax. Each tuple of this dataset holds an image id, the image, the description of the body part and an attribute that specifies whether or not the image shows a pathological condition. Two similarity measures can be used to query this dataset by similarity: the first is the Manhattan (L1) distance function over normalized gray-scale histograms, and the second is based on a texture extractor (description of the available datasets). All information that could identify a patient (such as name, birth date, place and date of the exam) is omitted.
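
    As an illustration of the first measure, the following is a minimal Python sketch of the Manhattan (L1) distance between normalized gray-scale histograms. It is not SIREN's extractor: the function names, the flat list-of-pixels image representation and the 256-level default are assumptions made only for this example.

```python
from typing import Sequence


def normalized_histogram(pixels: Sequence[int], levels: int = 256) -> list[float]:
    """Count how often each gray level occurs and scale the bins to sum to 1.

    `pixels` is a flat sequence of gray levels in [0, levels); this
    representation is an assumption made for the sketch.
    """
    if len(pixels) == 0:
        raise ValueError("image has no pixels")
    counts = [0] * levels
    for p in pixels:
        counts[p] += 1
    total = len(pixels)
    return [c / total for c in counts]


def manhattan(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Manhattan (L1) distance: sum of absolute bin-wise differences."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    return sum(abs(a - b) for a, b in zip(h1, h2))


if __name__ == "__main__":
    # Two tiny made-up "images" with 4 gray levels.
    a = normalized_histogram([0, 0, 1, 1], levels=4)
    b = normalized_histogram([0, 1, 1, 1], levels=4)
    print(manhattan(a, b))
```

    Because the histograms are normalized, two images of different sizes with the same proportion of each gray level are at distance 0 under this sketch.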

    For the AUDIO data type, no dataset is available due to copyright restrictions.

    Users can use the CREATE METRIC and CREATE INDEX commands to create their own indexes for these datasets.

    Some examples of similarity queries that can be posed over the datasets described above are:

    1. SELECT carname, horsepower, consumption, acceleration, origin
      FROM Cars
      WHERE car NEAR (
        67 AS hp,
        38 AS mpg,
        15 AS sec
        ) STOP AFTER 3

    2. SELECT carname, horsepower, consumption, acceleration, origin
      FROM Cars
      WHERE car NEAR (
        SELECT horsepower AS hp, consumption AS mpg, acceleration AS sec
        FROM Cars
        WHERE carname = 'ford mustang'
        ) STOP AFTER 10
      AND origin <> 'American'

    3. SELECT americancars.carname, europeancars.carname
      FROM americancars, europeancars
      WHERE americancars.car NEAR europeancars.car STOP AFTER 3

    4. SELECT BodyPart, Pathology, Img
      FROM MedImages
      WHERE Img NEAR 'D:\Images\sk_11424_0.jpg'
        BY Texture Range 0.0265

    5. SELECT BodyPart, Pathology, Img
      FROM MedImages
      WHERE Img NEAR (
        SELECT Img
        FROM MedImages
        WHERE Id = 3948
        ) STOP AFTER 5 AND Pathology = 'N'

    One could argue that the first two queries can be solved with a function written in procedural SQL. Please see this example. The problem is that this approach rules out optimizations such as the use of indexes, because the function is executed for every row of the table.
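
    The cost of that per-row approach can be sketched in Python as a brute-force scan that answers a "STOP AFTER k" style query. This is an illustration only, not SIREN's or Oracle's implementation: the Euclidean distance, the function names and the sample rows are all assumptions (the rows are made up and are not taken from the Cars dataset).

```python
import heapq
from typing import Callable, Sequence

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean distance; the metric is an assumption for this sketch."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def knn_scan(
    rows: Sequence[tuple[str, Vector]],
    query: Vector,
    k: int,
    distance: Callable[[Vector, Vector], float] = euclidean,
) -> tuple[list[tuple[float, str]], int]:
    """Return the k rows nearest to `query` and the number of distance calls.

    Every row is compared with the query, which mirrors a procedural
    function being invoked once per row of the table.
    """
    evaluations = 0
    scored = []
    for name, features in rows:
        scored.append((distance(features, query), name))
        evaluations += 1
    return heapq.nsmallest(k, scored), evaluations


if __name__ == "__main__":
    # Hypothetical (horsepower, mpg, seconds) rows.
    rows = [
        ("car_a", (65.0, 37.0, 15.5)),
        ("car_b", (150.0, 15.0, 10.0)),
        ("car_c", (70.0, 36.0, 16.0)),
        ("car_d", (90.0, 28.0, 14.0)),
    ]
    nearest, evaluations = knn_scan(rows, (67.0, 38.0, 15.0), k=3)
    print([name for _, name in nearest], evaluations)
```

    In this sketch the number of distance evaluations always equals the number of rows, whatever the value of k. An index built over the metric is meant to avoid exactly that, by discarding parts of the data without computing a distance for every row.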

    This site does not allow users to upload data, so its use is restricted to the images already stored in the core database. If you want to upload an image database to our site, please contact the webmaster for instructions.

    You may try the SIREN demo now!

    Disclaimer

    SIREN is a prototype that runs on an Oracle 10g database, and its purpose is evaluation/development only. The available datasets are research examples, and their purpose is to exemplify the proposed language extension.