Distributed Multimedia Database Technologies Supported by MPEG-7 and MPEG-21
4.5 Multimedia Query Languages, Optimization, and Processing
To retrieve multimedia data described by a metadata model, a database system must provide a multimedia query language. This language must be able to handle queries that specify complex spatial and temporal relationships, [179], [180], [181] keywords, and the objective and subjective contents of multimedia objects. In this context, Gudivada et al. [182] defined a set of generic query classes. These classes include:
-
Retrieval by browsing: employed by users who are not familiar with the structure and contents of MMDBMSs. Browsing is also often used for exploratory querying.
-
Querying objective and subjective attributes: An objective attribute query is a structured way of querying that is similar to queries on built-in data types. It is used in contexts in which the syntactical form of the query can be specified precisely. This is in contrast to situations in which the user submits a fuzzy query formulation.
-
A subjective attribute query is composed of attributes whose interpretation varies considerably from one user to another. For example, a query such as "give me all faces that have a beautiful nose" is subjective, and what "a beautiful nose" resembles is dependent on the taste of the user. Although subjective querying is of obvious interest, at present there is very little research work on this issue.
-
CBR queries: allow one to query the low-level characteristics of the media, for example, color, shape, and texture queries for visual media (see Section 4.4).
-
Correlation queries: used to retrieve media that are similar to the query object in terms of
-
Spatial relationship: directional and topological relationships between the corresponding pairs of objects in the query and the database
-
Temporal relationship: relations between the time instances at which the query object and objects in the database appear or disappear
-
Spatio-temporal relationship: a domain phenomenon that varies in space and time
Recent works [183], [184] propose a framework for semantic integrated information indexing and retrieval by defining a semantic distance function between two conceptual entities in a linguistic ontology. However, these works focus exclusively on CBR. A future version of MPEG-7 will try to identify descriptors that no longer represent the fundamental qualities of audiovisual content, but instead represent subjective content. This obviously requires a notion of common sense. [185] Somewhat in this direction, the current version of MPEG-7 [186] (see also Chapter 2) introduces the principle of a ControlledTerm Datatype, by which one can control the value of a textual field using a classification scheme; for example, a sport expression can be controlled by the classification scheme provided by the Olympic movement (http://www.olympic.org). In the same manner, although it is considerably more complex, a commonsense principle might be set up for subjective descriptions.
Moreover, it is desirable to specify the presentation of the results within the query language. For this, several approaches [187], [188], [189], [190], [191] propose a combined multimedia query and presentation language; that is, a query expression consisting of two components: a query language to describe what information to retrieve and a presentation language to specify how to display query results. The advantage of an integrated presentation and query specification is that optimization issues for the presentation part can be addressed effectively with traditional query optimization methods. Adali et al. [192] apply such a principle.
Let us consider as an example the approach of Gonzalez et al. [193] They introduced SQL+D, a multimedia and presentation extension to object-relational SQL that gives users the ability to specify, within an SQL query, a screen layout to be used to show the result of the query. For example, the following object-relational table MONUM contains video and audio information about the great world monuments: MONUM (name: string; country: string; a: audio; v: video). If a user now wants to see video clips showing monuments located in the U.S., together with their corresponding audio streams, the content query and presentation specifications can be formulated by the following SQL+D query:
SELECT a,v
FROM MONUM
WHERE country='USA'
DISPLAY panel main
  WITH a AS audio A, v AS video V ON main.Center(Overlay),
  SHOW V,A
Exhibit 4.12 shows the principal steps of SQL+D query processing and the result presentation of the sample query. The SQL+D interpreter receives the query and splits it into the SQL-based content query and the DISPLAY specification. Then, the SQL+D interpreter requests a connection to the database, which is obtained by the database interface. The database interface is responsible for starting the content query processing. The DISPLAY specification and the result of the content query processing are sent as inputs to the display controller, which shows the multimedia display. The result window shows one monument video from the result set. The buttons in the interface allow navigation through the query result set.
Exhibit 4.12: SQL+D query processing and result interface.
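The splitting step performed by the interpreter can be illustrated with a minimal Python sketch. It is not the SQL+D implementation: the function name split_sqld is hypothetical, and the sketch assumes that DISPLAY occurs exactly once as a top-level keyword and never inside a string literal.

```python
import re

def split_sqld(query):
    """Split an SQL+D statement into (content_query, display_spec).

    Assumption: DISPLAY appears once as a top-level keyword and
    never inside a string literal of the content query.
    """
    match = re.search(r"\bDISPLAY\b", query, flags=re.IGNORECASE)
    if match is None:
        # Plain SQL without a presentation part.
        return query.strip(), ""
    return query[:match.start()].strip(), query[match.start():].strip()

query = (
    "SELECT a,v FROM MONUM WHERE country='USA' "
    "DISPLAY panel main WITH a AS audio A, v AS video V "
    "ON main.Center(Overlay), SHOW V,A"
)
content_query, display_spec = split_sqld(query)
print(content_query)
print(display_spec)
```

In such a design, the first part would go to the database interface and the second part to the display controller.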
Note that the use of multimedia data and content required the introduction of new data types and associated methods in SQL+D. As a consequence, we have to employ an extensible DBMS for the processing of these enhanced features. Whether or not extensible DBMSs are the right technical platform for effective multimedia query processing and optimization is discussed in the next section.
4.5.1 Multimedia Query Processing Platforms and Technologies
Object-Relational DBMSs (ORDBMSs), as broadly used extensible DBMS platforms, are currently the most promising technical platforms [194] to implement a multimedia data model and to realize query processing and optimization by providing the following facilities:
-
Providing extensible data types with associated functions and operations and data access methods;
-
Offering extensible indexing, processing, and optimization techniques (e.g., the Oracle ORDBMS provides a facility called Data Cartridge Technology that makes it possible to specify domain indexes and cost models for a newly introduced data type; the MDC proposed in Section 4.6 relies on this technology);
-
Unifying declarative and navigable access on the same level for model and language; and
-
Managing multimedia data externally and internally. Externally, data are stored as files, and the names of these files are stored in the database. With internal management, the fragments are stored as a series of separate objects in the database.
Object-Oriented DBMSs (OODBMSs) provide the above features as well. However, they do not rely on the relational table structure as a storage unit. This might sound like an advantage of OODBMSs at first glance. It is, however, a disadvantage. First, using tables allows ORDBMSs to reuse the field-tested processing and storage strategies of relational databases, which made these systems robust even in the early phase of their introduction to the market. Second, tables provide regular access structures and thus enable the use of efficient and simple look-ahead optimization strategies. There are few OODBMSs with multimedia extensions that can compete with the respective multimedia-enabled ORDBMSs in robustness and performance. One of them is ObjectStore, which manages multimedia data; for more information, see http://www.exln.com/products/objectstore/.
Finally, the DISIMA (Distributed Multimedia DBMS) multimedia query processor [195], [196] provides means for complex query processing and optimization of MOQL (Multimedia Object Query Language) queries and is built on top of ObjectStore. Chapter 6 describes the DISIMA system. Current market-leading ORDBMS products provide multimedia packages; that is, collections of data types to support multimedia processing.
IBM Informix packs different data types for images and videos in DataBlade modules; for example, image processing in the Excalibur Image DataBlade. [197] Oracle provides the Oracle Visual Information Retrieval system [198] for images, and IBM's DB2 Universal Database offers multimedia extenders. [199]
4.5.2 Use Case Study: Oracle Visual Information Retrieval System
Let us consider as an example the Oracle Visual Information Retrieval system. This system is an extension of the Oracle 9i ORDBMS, providing image storage, CBIR functionality, and format conversion capabilities through a new object type ORDImage and associated methods and functions. CBIR functionality is provided by the ability to extract an image feature vector from four different visual attributes: global color, local color, texture, and shape. Moreover, Oracle provides a multidimensional index structure ORDImageIndex to accelerate access to the stored feature vectors.
In this system, simple CBR and attribute queries can be handled. However, complex join queries can only be partially treated. The following example shows the only possible formulation of a join between two image tables Picture1 and Picture2:
CREATE TABLE Picture1 (
  author VARCHAR2(30),
  description VARCHAR2(200),
  photo1 ORDSYS.ORDImage,
  photo1_sig ORDSYS.ORDImageSignature
);
CREATE TABLE Picture2 (
  mydescription VARCHAR2(200),
  photo2 ORDSYS.ORDImage,
  photo2_sig ORDSYS.ORDImageSignature
);
These tables are joined based on their feature vectors (called Signatures in Oracle), which are stored in the table element photo1_sig (photo2_sig) of type ORDSYS.ORDImageSignature. Before they can be compared, the Signatures must first be generated with the member function photo1_sig.generateSignature(img). After this generation, the similarity function ORDSYS.IMGSimilar() is applied, which takes as inputs the two instances of the tables to be joined. The statement results in pairs of instances for which a user-defined threshold similarity value, which governs the difference of the respective feature vectors, has not been exceeded. This means that if the weighted sum of the distances for the visual attributes is less than or equal to the threshold, the images match. The join query is then expressed as:
SELECT p1.description, p2.mydescription
FROM Picture1 p1, Picture2 p2
WHERE ORDSYS.IMGSimilar(p1.photo1_sig, p2.photo2_sig,
        'color="0.6" texture="0.2" shape="0.1" location="0.1"', 20) = 1;
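The matching rule behind this join (a weighted sum of per-attribute distances compared against a threshold) can be sketched in a few lines of Python. This is an illustration only and not Oracle's implementation: each attribute is reduced to a single hypothetical number, the absolute difference stands in for the real distance measure, and the sample rows are invented. The weights 0.6, 0.2, 0.1, and 0.1 and the threshold 20 are taken from the query above.

```python
from itertools import product

# Attribute weights as in the query above.
WEIGHTS = {"color": 0.6, "texture": 0.2, "shape": 0.1, "location": 0.1}

def weighted_distance(sig1, sig2, weights):
    """Weighted sum of per-attribute distances (absolute difference as a stand-in)."""
    return sum(w * abs(sig1[attr] - sig2[attr]) for attr, w in weights.items())

def similarity_join(left, right, weights, threshold):
    """Return the description pairs whose weighted distance is <= threshold."""
    return [
        (l["description"], r["description"])
        for l, r in product(left, right)
        if weighted_distance(l["sig"], r["sig"], weights) <= threshold
    ]

# Invented sample rows; "sig" plays the role of the stored signature.
picture1 = [
    {"description": "sunset", "sig": {"color": 10, "texture": 40, "shape": 5, "location": 0}},
    {"description": "forest", "sig": {"color": 90, "texture": 70, "shape": 60, "location": 50}},
]
picture2 = [
    {"description": "dusk", "sig": {"color": 20, "texture": 50, "shape": 5, "location": 10}},
    {"description": "desert", "sig": {"color": 80, "texture": 10, "shape": 90, "location": 90}},
]

pairs = similarity_join(picture1, picture2, WEIGHTS, threshold=20)
print(pairs)  # [('sunset', 'dusk')]
```

Only the pair (sunset, dusk) has a weighted distance (9) below the threshold; the nested-loop evaluation over all pairs also hints at why such joins are expensive without index support.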
The SQL/MM standard [200] (Section 4.3) introduces new object-relational types to handle multimedia data. Notably, the SI_StillImage type is proposed for holding the image and, for example, SI_ColorHistogram and SI_AverageColor for representing features. In addition to the type concept, several methods have been specified to allow CBR functionality. For instance, the polymorphic SI_Score method compares two feature vectors. Assume that the object colorhist contains the color histogram of an image Picture1. Then, the method colorhist.SI_Score(Picture2) applied to another image, Picture2, returns a value greater than or equal to 0, with the meaning that the lower the returned value, the closer the color histogram values of Picture1 and Picture2 are.
Let us reconsider the ORDImage query from above and see how it looks in SQL/MM: Let us assume that the tables Picture1 and Picture2 contain an element of type SI_StillImage. Their definitions are given below. Note that each feature type used in the table definition must be explicit.
CREATE TABLE Picture1 (
  author VARCHAR2(30),
  description VARCHAR2(200),
  photo1 SI_StillImage,
  photo1_color SI_ColorHistogram,
  photo1_texture SI_Texture
);
CREATE TABLE Picture2 (
  mydescription VARCHAR2(200),
  photo2 SI_StillImage,
  photo2_color SI_ColorHistogram,
  photo2_texture SI_Texture
);
Furthermore, let us suppose that we would like to join the two tables based on the ColorHistogram and the Texture features, where the difference of the ColorHistogram values should not exceed 0.5 and that of the Texture values should not exceed 0.4. The SQL/MM query is then expressed as
SELECT p1, p2
FROM Picture1 p1, Picture2 p2
WHERE p1.photo1_color.SI_Score(p2.photo2) <= 0.5
  AND p1.photo1_texture.SI_Score(p2.photo2) <= 0.4
A similar join definition can be found in other systems, for example, the IBM Informix [201] Excalibur Image DataBlade module and prototype implementations and the DISIMA system. [202]
However, it is not possible in these systems to formulate in a single SQL statement a join through the method of the NN search; that is, retain for all tuples of the left-input relation the k-NN in the right-input table. Processing a multimedia join operation through the NN-search method is, however, an alternative and useful form of multimedia join.
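To make the semantics of such a join concrete, the following Python sketch keeps, for every tuple of the left input, its k nearest neighbors in the right input. It is a brute-force illustration under stated assumptions: feature vectors are plain coordinate tuples, the Euclidean distance is used as the similarity measure, and the names knn_join, left, and right are hypothetical. A real system would answer the inner search through a multidimensional index rather than a full scan.

```python
import heapq
import math

def knn_join(left, right, k):
    """For every left tuple, keep the ids of its k nearest right tuples."""
    result = {}
    for l_id, l_vec in left.items():
        # Full scan of the right input; an index would replace this step.
        nearest = heapq.nsmallest(
            k, right.items(), key=lambda item: math.dist(l_vec, item[1])
        )
        result[l_id] = [r_id for r_id, _ in nearest]
    return result

# Invented two-dimensional feature vectors.
left = {"p1": (0.0, 0.0), "p2": (5.0, 5.0)}
right = {"q1": (1.0, 0.0), "q2": (4.0, 5.0), "q3": (0.0, 2.0), "q4": (9.0, 9.0)}

neighbors = knn_join(left, right, k=2)
print(neighbors)  # {'p1': ['q1', 'q3'], 'p2': ['q2', 'q4']}
```

In contrast to the threshold-based join, every left tuple obtains exactly k partners (as long as the right input holds at least k tuples), regardless of how far away they are.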
In this context, we proposed a complete framework for the integration of NN-search supporting operators in an image DBMS. [203], [204], [205] This framework includes an image data type and associated operators, an image algebra, optimization strategies, and finally, appropriate processing strategies. The framework is designed and implemented for an image database relying on the efficient multidimensional index structure of X-trees. [206]
Another important point of concern is the formulation and processing of correlation queries. The above-mentioned multimedia database products, like the DataBlade modules, the Oracle Cartridge ORDImage, and the Multimedia Extenders, as well as the standardized proposition of SQL/MM, do not support correlation queries. For example, these systems are unable to handle image queries like "give me all images where a person stands in front of a car." The lack of such support might increase the number of undesired matches for CBR queries. For example, if a user wants to find all images showing red automobiles and if each automobile has a person standing in front of it, the color, shape, and position of the person (skin and clothing) will cause color and shape similarities to be detected. This might reduce the importance of color and shape similarities between automobiles because part of the automobile is behind the person and thus not visible. Some MMDBMSs, such as the DISIMA [207] system, overcome this shortcoming. For instance, DISIMA accepts queries in MOQL, [208] which extends OQL by adding new predicate expressions that allow the specification of correlation queries.
4.5.3 Multimedia Query Optimization
In addition to the limited functionality of the proposed query-processing strategies discussed above, effective query optimization mechanisms for multimedia databases are rarely provided. There have been quite a number of proposals of object-oriented and object-relational query algebras, such as the AQUA algebra [209] and the KOLA algebra. [210], [211] Most of them also handle the problem of query optimization in the presence of methods or foreign functions. [212], [213], [214] These latter works mainly focus on the appropriate position of the method evaluation within a query execution plan. They are, however, less practical for managing multimedia data and multimedia index structures. This is because of the large data volume introduced by these kinds of media and also because of their complicated access functionality by feature vectors and the complex searching and matching algorithms involved in similarity searching.
There have been some systems introduced that focus on similarity-based query optimization. Adali et al., [215] for example, propose a similarity algebra at a higher abstraction level that integrates heterogeneous similarity measures coming from different similarity implementations into one common framework. For instance, it allows the formulation and optimization of a query returning the union of the best 10 matches from a black-and-white and a color image database. The framework is implemented on top of an integrated search engine. However, it does not provide the implementation level required for processing a multimedia join through the NN search method, nor does it deal with combined multimedia and relational queries.
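The example query mentioned above (the union of the best matches from two databases, each ranked by its own similarity measure) can be mimicked with a toy Python sketch. It does not reproduce the algebra of Adali et al.; the two similarity functions, the databases, and the use of k = 2 instead of 10 are assumptions made purely for illustration.

```python
import heapq

def best_matches(database, query, similarity, k):
    """Return the ids of the k objects most similar to the query."""
    ranked = heapq.nlargest(
        k, database.items(), key=lambda item: similarity(query, item[1])
    )
    return [obj_id for obj_id, _ in ranked]

def gray_similarity(q, x):
    # Hypothetical measure for black-and-white images: closeness of mean intensity.
    return 1.0 - abs(q - x)

def color_similarity(q, x):
    # Hypothetical measure for color images: closeness of mean RGB values.
    return 1.0 - sum(abs(a - b) for a, b in zip(q, x)) / 3.0

# Invented feature values: one intensity per gray image, one RGB triple per color image.
bw_db = {"bw1": 0.2, "bw2": 0.5, "bw3": 0.9}
color_db = {"c1": (0.9, 0.1, 0.1), "c2": (0.1, 0.1, 0.9), "c3": (0.8, 0.2, 0.2)}

# Union of the best two matches from each database.
result = set(best_matches(bw_db, 0.45, gray_similarity, k=2)) | set(
    best_matches(color_db, (1.0, 0.0, 0.0), color_similarity, k=2)
)
print(sorted(result))  # ['bw1', 'bw2', 'c1', 'c3']
```

The sketch sidesteps the hard part that a similarity algebra addresses: the scores of the two measures are never compared with each other, so no common scale is needed here.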
Ciaccia et al. [216] recently developed the SAMEW algebra. In addition to the operators introduced by Adali et al., [217] this algebra introduces user preferences such as weights and also captures imprecision in feature representations. However, implementation issues of the complex operations introduced are not addressed.
Possibilities for optimizing complex multimedia expressions were introduced by Stonebraker [218] for simple multimedia join processing. Stonebraker gives some initial optimization ideas. For example, he observes that the traditional select-push rule, which is to push the select operator as close to the base table as possible, might no longer apply to multimedia query optimization.
Let us imagine a query to be composed of a highly selective, nonmultimedia join between an image table, "Picture," and a traditional table, "Example," and a select operator, "redness(pict)<0.1," on the instances pict of the table "Picture." Assume further that the redness has to be computed on the fly (i.e., no feature vector was stored in the database). Looking then at the processing complexity, one will observe that the computation of the redness of a picture is a computationally intensive operation, which will be better if applied to the result of the join. The reason is that the result of the join produces fewer images than the base table, "Picture," contains.
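The effect can be made visible by counting how often the expensive predicate is evaluated under each plan. The following Python sketch is a toy model under stated assumptions: redness is a stand-in that merely reads a precomputed value and counts its calls, and the table contents (1000 pictures, 3 matching rows in "Example") are invented.

```python
def redness(pict):
    """Stand-in for an expensive on-the-fly feature computation; counts its calls."""
    redness.calls += 1
    return pict["red_fraction"]

redness.calls = 0

def select_then_join(pictures, examples):
    """Plan A: push the selection down to the base table, then join."""
    selected = [p for p in pictures if redness(p) < 0.1]
    keys = {e["pict_id"] for e in examples}
    return [p["id"] for p in selected if p["id"] in keys]

def join_then_select(pictures, examples):
    """Plan B: evaluate the selective join first, then the expensive predicate."""
    keys = {e["pict_id"] for e in examples}
    joined = [p for p in pictures if p["id"] in keys]
    return [p["id"] for p in joined if redness(p) < 0.1]

# Invented data: 1000 pictures, only 3 of which have a join partner.
pictures = [{"id": i, "red_fraction": (i % 10) / 20} for i in range(1000)]
examples = [{"pict_id": 0}, {"pict_id": 11}, {"pict_id": 25}]

redness.calls = 0
plan_a = select_then_join(pictures, examples)
calls_a = redness.calls

redness.calls = 0
plan_b = join_then_select(pictures, examples)
calls_b = redness.calls

print(plan_a, calls_a)  # [0, 11] 1000
print(plan_b, calls_b)  # [0, 11] 3
```

Both plans return the same result, but the pushed-down selection evaluates redness once per base tuple, whereas the late selection evaluates it only on the few tuples that survive the join.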
This example gives a flavor of the complexity of query optimization in multimedia databases. In fact, optimization strategies for complex multimedia processing, including the optimization of compound expressions of relational and multimedia operators, remain an open research area.
Finally, it has to be noted that multimedia processing implemented on top of an ORDBMS can deal only with well-defined queries, including the NN-search, where users can specify exactly the nature of their query intentions. However, knowledge of the intentions is sometimes not available at the level of exactness required by the query interface. Therefore, fuzzy querying is required where the properties of query objects are ambiguous or unclear.
In this context, the CHITRA system uses a fuzzy object query language (FOQL) [219], [220] that is an extension of OQL. Another alternative to a well-defined query statement is query formulation through query refinement. The Multimedia Analysis and Retrieval System (MARS) [221], [222] allows complex query formulation by means of an intelligent query refinement tool. The query processor of the MARS system uses a query expansion approach [223] in which relevant objects are added to a new query representation. However, this approach is user-interaction centric and does not allow the declarative definition of similarity-based joins.
[179]Chen, S.-C., Kashyap, R.L., and Ghafoor, A., Semantic Models for Multimedia Database Searching and Browsing, Kluwer, Dordrecht, 2000.
[180]Weiss, R., Duda, A., and Gifford, D.K., Content-based access to algebraic video, in Proceedings of the International Conference on Multimedia Computing and Systems, Boston, May 1994, pp. 140–151.
[181]Li, J.Z., Goralwalla, I.A., Özsu, M.T., and Szafron, D., Modeling video temporal relationships in an object database management system, Multimedia Comput. Networking, 80–91, 1997.
[182]Gudivada, V.N., Raghavan, R., and Vanapipat, K., A Unified Approach to Data Modelling and Retrieval for a Class of Image Database Applications, Springer-Verlag, Heidelberg, 1996.
[183]Park, Y., Kim, P., Golshani, F., and Panchanathan, S., Concept-based visual information management with large lexical corpus, in Proceedings of the International Conference on Database and Expert Applications (DEXA), Munich, September 2001. Springer-Verlag, Heidelberg, LNCS 2113, pp. 350–359.
[184]Li, W.-S., Selçuk Candan, K., Hirata, K., and Hara, Y., Supporting efficient multimedia database exploration, VLDB J., 9, 312–326, 2001.
[185]Brewer, E.A., When everything is searchable. Comm. ACM, 44, 53–54, 2001.
[186]Martínez, J.M. Overview of the MPEG-7 Standard. ISO/IEC JTC1/SC29/WG11 N4980, (Klagenfurt Meeting), July 2002, http://www.chiariglione.org/mpeg/.
[187]Li, J.Z., Özsu, M.T., Szafron, D., and Oria, V., MOQL: a multimedia object query language, in 3rd International Workshop on Multimedia Information Systems, Como, September 1997, Springer-Verlag, Heidelberg, LNCS Series, pp. 19–28.
[188]Aloia, N., Matera, M., and Paterno, F., Presentations for databases in multimedia environments. Multimedia Syst., 6, 408–420, 1998.
[189]Baral, C., Gonzalez, G., and Nandigam, A., SQL+D: extended display capabilities for multimedia database queries, in ACM International Conference on Multimedia, Bristol, England, September 1998, pp. 109–114.
[190]Marcus, S., Multimedia Database System: Issues and Research Direction, Springer-Verlag, Heidelberg, 1996.
[191]Adiba, M. and Zechinelli-Martin, J.L., Spatio-temporal multimedia presentations as database objects, in International Conference on Database and Expert Systems Applications (DEXA), Florence, September 2000, Springer-Verlag, Heidelberg, LNCS Series 1677, pp. 974–985.
[192]Adali, S., Sapino, M.L., and Subrahmanian, V.S., A multimedia presentation algebra, in Proceedings of the ACM SIGMOD International Conference of Management of Data, Philadelphia, June 1999, pp. 121–132.
[193]Baral, C., Gonzalez, G., and Nandigam, A., SQL+D: extended display capabilities for multimedia database queries, in ACM International Conference on Multimedia, Bristol, England, September 1998, pp. 109–114.
[194]Stonebraker, M., Object-Relational DBMS: The Next Wave, 2nd edition, Morgan Kaufmann, 1998.
[195]Oria, V., Özsu, M.T., Xu, B., Cheng, L.I., and Iglinski, P., VisualMOQL: the DISIMA visual query language, in IEEE International Conference on Multimedia Computing and Systems, Florence, Italy, June 1999, Vol. 1, pp. 536–542.
[196]Oria, V., Özsu, M.T., Iglinski, P., Lin, S., and Ya, B., DISIMA: a distributed and interoperable image database system, in Proceedings of the ACM SIGMOD International Conference of Management of Data, Dallas, May 2000, p. 600.
[197]Excalibur Image DataBlade Module User's Guide, Version 1.2, Informix Press, 1999.
[198]Freeman, R.G., Oracle 9i New Features, McGraw-Hill Osborne Media, 2002.
[199]Chamberlain, D., A complete guide to DB2 Universal Database. Academic Press/Morgan Kaufmann, 1998.
[200]Melton, J. and Eisenberg, A., SQL multimedia and application packages (SQL/MM), SIGMOD Rec., 30(4), 97–102, 2001.
[201]Excalibur Image DataBlade Module User's Guide, Version 1.2, Informix Press, 1999.
[202]Oria, V., Özsu, M.T., Xu, B., Cheng, L.I., and Iglinski, P., VisualMOQL: the DISIMA visual query language, in IEEE International Conference on Multimedia Computing and Systems, Florence, Italy, June 1999, Vol. 1, pp. 536–542.
[203]Atnafu, S., Brunie, L., and Kosch, H., Similarity-based operators and query optimization for multimedia database systems, in Proceedings of the International Database Engineering and Applications Symposium (IDEAS), Grenoble, July 2001, IEEE CS Press, pp. 346–355.
[204]Atnafu, S., Brunie, L., and Kosch, H., Similarity-based operators in image database systems, in Proceedings of the International Conference on Advances in Web-Age Information Management (WAIM), Xi'an, China, July 2001, Springer-Verlag, Heidelberg, LNCS 2118, pp. 14–25.
[205]Kosch, H. and Atnafu, S., A multimedia join by the method of nearest neighbor search. Info. Processing Lett., 82, 269–276, 2002.
[206]Berchtold, S., Keim, D.A., and Kriegel, H.-P., The X-tree: an indexing structure for high-dimensional data, in Proceedings of the VLDB Conference, Bombay, September 1996, pp. 28–39.
[207]Oria, V., Özsu, M.T., Iglinski, P., Lin, S., and Ya, B., DISIMA: a distributed and interoperable image database system, in Proceedings of the ACM SIGMOD International Conference of Management of Data, Dallas, May 2000, p. 600.
[208]Oria, V., Özsu, M.T., Xu, B., Cheng, L.I., and Iglinski, P., VisualMOQL: the DISIMA visual query language, in IEEE International Conference on Multimedia Computing and Systems, Florence, Italy, June 1999, Vol. 1, pp. 536–542.
[209]Leung, T.W., Mitchell, G., Subramanian, B., Vance, B., Vandenberg, S.L., and Zdonik, S.B., The Aqua data model and algebra. Technical Report CS-93-09, Brown University, Providence, 1993.
[210]Cherniack, M. and Zdonik, S.B., Rule languages and internal algebras for rule-based optimizers, in Proceedings of the ACM SIGMOD International Conference of Management of Data, Montréal, June 1996, pp. 401–412.
[211]Cherniack, M. and Zdonik, S.B., Changing the rules: transformations for rule-based optimizers, in Proceedings of the ACM SIGMOD International Conference of Management of Data, Seattle, June 1998, pp. 61–72.
[212]Hellerstein, J.M. and Stonebraker, M., Predicate migration: optimizing queries with expensive predicates, in Proceedings of the ACM SIGMOD International Conference of Management of Data, Washington, DC, May 1993, pp. 256–257.
[213]Chaudhuri, S. and Shim, K., Query optimization in the presence of foreign functions, in Proceedings of the International Conference on Very Large Databases, Dublin, 1993, pp. 529–542.
[214]Scheufele, W. and Moerkotte, G., Efficient dynamic programming algorithms for ordering expensive joins and selections, in Proceedings of the International Conference on Extending Database Technology, Valencia, March 1998, pp. 23–27.
[215]Adali, S., Bonatti, P., Sapino, M.L., and Subrahmanian, V.S., A multi-similarity algebra, in Proceedings of the ACM SIGMOD International Conference of Management of Data, Seattle, June 1998, ACM Press, pp. 402–413.
[216]Ciaccia, P., Montesi, D., Penzo, W., and Trombetta, A., Imprecision and user preferences in multimedia queries: a generic algebraic approach, in Foundations of Information and Knowledge Systems, First International Symposium, Burg Spreewald, February 2000. Springer-Verlag, Heidelberg, LNCS 1762, pp. 50–71.
[217]Adali, S., Bonatti, P., Sapino, M.L., and Subrahmanian, V.S., A multi-similarity algebra, in Proceedings of the ACM SIGMOD International Conference of Management of Data, Seattle, June 1998, ACM Press, pp. 402–413.
[218]Stonebraker, M., Object-Relational DBMS: The Next Wave, 2nd edition, Morgan Kaufmann, 1998.
[219]Nepal, S., Ramakrishna, M.V., and Thom, J.A., A fuzzy object query language (FOQL) for image databases, in International Conference on Database Systems for Advanced Applications, Hsinchu, April 1999. IEEE CS Press, pp. 117–124.
[220]Nepal, S., Ramakrishna, M.V., and Thom, J.A., A fuzzy system for content based image retrieval, in Proceedings of the IEEE International Conference on Intelligent Processing Systems, Gold Coast, August 4–7, 1998, pp. 335–339.
[221]Porkaew, K., Ortega, M., and Mehrotra, S., Query reformulation for content based multimedia retrieval in MARS, in IEEE International Conference on Multimedia Computing and Systems, Florence, June 1999, Vol. 2.
[222]Chakrabarti, S., Porkaew, K., and Mehrotra, S., Efficient query refinement in multimedia databases, in Proceedings of the IEEE International Conference on Data Engineering (ICDE), San Diego, February–March 2000, p. 196.
[223]Rui, Y., Huang, T., and Mehrotra, S., Relevance feedback: a powerful tool for content-based image retrieval. IEEE Trans. Circuits Syst. Video Technol., 8, 25–36, 1998.