Open Database Connectivity
In computing, Open Database Connectivity (ODBC) is a standard application programming interface (API) for accessing database management systems (DBMS). The designers of ODBC aimed to make it independent of database systems and operating systems. An application written using ODBC can be ported to other platforms, both on the client and server side, with few changes to the data access code.
ODBC accomplishes DBMS independence by using an ODBC driver as a translation layer between the application and the DBMS. The application uses ODBC functions through an ODBC driver manager with which it is linked, and the driver passes the query to the DBMS. An ODBC driver can be thought of as analogous to a printer driver or other driver, providing a standard set of functions for the application to use, and implementing DBMS-specific functionality. An application that can use ODBC is referred to as "ODBC-compliant". Any ODBC-compliant application can access any DBMS for which a driver is installed. Drivers exist for all major DBMSs, many other data sources like address book systems and Microsoft Excel, and even for text or comma-separated values (CSV) files.
ODBC was originally developed by Microsoft and Simba Technologies during the early 1990s, and became the basis for the Call Level Interface (CLI) standardized by SQL Access Group in the Unix and mainframe field. ODBC retained several features that were removed as part of the CLI effort. Full ODBC was later ported back to those platforms, and became a de facto standard considerably better known than CLI. The CLI remains similar to ODBC, and applications can be ported from one platform to the other with few changes.
History
Before ODBC
The introduction of the mainframe-based relational database during the 1970s led to a proliferation of data access methods. Generally these systems operated together with a simple command processor that allowed users to type in English-like commands, and receive output. The best-known examples are SQL from IBM and QUEL from the Ingres project. These systems may or may not allow other applications to access the data directly, and those that did use a wide variety of methodologies. The introduction of SQL aimed to solve the problem of language standardization, although substantial differences in implementation remained.
Since the SQL language had only rudimentary programming features, users often wanted to use SQL within a program written in another language, say Fortran or C. This led to the concept of Embedded SQL, which allowed SQL code to be embedded within another language. For instance, a SQL statement like SELECT * FROM city
could be inserted as text within C source code, and during compiling it would be converted into a custom format that directly called a function within a library that would pass the statement into the SQL system. Results returned from the statements would be interpreted back into C data formats like char *
using similar library code.
There were several problems with the Embedded SQL approach. Like the different varieties of SQL, the Embedded SQLs that used them varied widely, not only from platform to platform, but even across languages on one platform – a system that allowed calls into IBM Db2 would look very different from one that called into their own SQL/DS. Another key problem to the Embedded SQL concept was that the SQL code could only be changed in the program's source code, so that even small changes to the query required considerable programmer effort to modify. The SQL market referred to this as static SQL, versus dynamic SQL which could be changed at any time, like the command-line interfaces that shipped with almost all SQL systems, or a programming interface that left the SQL as plain text until it was called. Dynamic SQL systems became a major focus for SQL vendors during the 1980s.
Older mainframe databases, and the newer microcomputer based systems that were based on them, generally did not have a SQL-like command processor between the user and the database engine. Instead, the data was accessed directly by the program – a programming library in the case of large mainframe systems, or a command line interface or interactive forms system in the case of dBASE and similar applications. Data from dBASE could not generally be accessed directly by other programs running on the machine. Those programs may be given a way to access this data, often through libraries, but it would not work with any other database engine, or even different databases in the same engine. In effect, all such systems were static, which presented considerable problems.
Early efforts
By the mid-1980s the rapid improvement in microcomputers, and especially the introduction of the graphical user interface and data-rich application programs like Lotus 1-2-3 led to an increasing interest in using personal computers as the client-side platform of choice in client–server computing. Under this model, large mainframes and minicomputers would be used primarily to serve up data over local area networks to microcomputers that would interpret, display and manipulate that data. For this model to work, a data access standard was a requirement – in the mainframe field it was highly likely that all of the computers in a shop were from one vendor and clients were computer terminals talking directly to them, but in the micro field there was no such standardization and any client might access any server using any networking system.
By the late 1980s there were several efforts underway to provide an abstraction layer for this purpose. Some of these were mainframe related, designed to allow programs running on those machines to translate between the variety of SQL's and provide a single common interface which could then be called by other mainframe or microcomputer programs. These solutions included IBM's Distributed Relational Database Architecture (DRDA) and Apple Computer's Data Access Language. Much more common, however, were systems that ran entirely on microcomputers, including a complete protocol stack that included any required networking or file translation support.
One of the early examples of such a system was Lotus Development's DataLens, initially known as Blueprint. Blueprint, developed for 1-2-3, supported a variety of data sources, including SQL/DS, DB2, FOCUS and a variety of similar mainframe systems, as well as microcomputer systems like dBase and the early Microsoft/Ashton-Tate efforts that would eventually develop into Microsoft SQL Server.[1] Unlike the later ODBC, Blueprint was a purely code-based system, lacking anything approximating a command language like SQL. Instead, programmers used data structures to store the query information, constructing a query by linking many of these structures together. Lotus referred to these compound structures as query trees.[2]
Around the same time, an industry team including members from Sybase (Tom Haggin), Tandem Computers (Jim Gray & Rao Yendluri) and Microsoft (Kyle G) were working on a standardized dynamic SQL concept. Much of the system was based on Sybase's DB-Library system, with the Sybase-specific sections removed and several additions to support other platforms.[3] DB-Library was aided by an industry-wide move from library systems that were tightly linked to a specific language, to library systems that were provided by the operating system and required the languages on that platform to conform to its standards. This meant that a single library could be used with (potentially) any programming language on a given platform.
The first draft of the Microsoft Data Access API was published in April 1989, about the same time as Lotus' announcement of Blueprint.[4] In spite of Blueprint's great lead – it was running when MSDA was still a paper project – Lotus eventually joined the MSDA efforts as it became clear that SQL would become the de facto database standard.[2] After considerable industry input, in the summer of 1989 the standard became SQL Connectivity (SQLC).[5]
SAG and CLI
In 1988 several vendors, mostly from the Unix and database communities, formed the SQL Access Group (SAG) in an effort to produce a single basic standard for the SQL language. At the first meeting there was considerable debate over whether or not the effort should work solely on the SQL language itself, or attempt a wider standardization which included a dynamic SQL language-embedding system as well, what they called a Call Level Interface (CLI).[6] While attending the meeting with an early draft of what was then still known as MS Data Access, Kyle Geiger of Microsoft invited Jeff Balboni and Larry Barnes of Digital Equipment Corporation (DEC) to join the SQLC meetings as well. SQLC was a potential solution to the call for the CLI, which was being led by DEC.
The new SQLC "gang of four", MS, Tandem, DEC and Sybase, brought an updated version of SQLC to the next SAG meeting in June 1990.[7] The SAG responded by opening the standard effort to any competing design, but of the many proposals, only Oracle Corp had a system that presented serious competition. In the end, SQLC won the votes and became the draft standard, but only after large portions of the API were removed – the standards document was trimmed from 120 pages to 50 during this time. It was also during this period that the name Call Level Interface was formally adopted.[7] In 1995 SQL/CLI became part of the international SQL standard, ISO/IEC 9075-3.[8] The SAG itself was taken over by the X/Open group in 1996, and, over time, became part of The Open Group's Common Application Environment.
MS continued working with the original SQLC standard, retaining many of the advanced features that were removed from the CLI version. These included features like scrollable cursors, and metadata information queries. The commands in the API were split into groups; the Core group was identical to the CLI, the Level 1 extensions were commands that would be easy to implement in drivers, while Level 2 commands contained the more advanced features like cursors. A proposed standard was released in December 1991, and industry input was gathered and worked into the system through 1992, resulting in yet another name change to ODBC.[9]
JET and ODBC
During this time, Microsoft was in the midst of developing their Jet database system. Jet combined three primary subsystems; an ISAM-based database engine (also named Jet, confusingly), a C-based interface allowing applications to access that data, and a selection of driver dynamic-link libraries (DLL) that allowed the same C interface to redirect input and output to other ISAM-based databases, like Paradox and xBase. Jet allowed using one set of calls to access common microcomputer databases in a fashion similar to Blueprint, by then renamed DataLens. However, Jet did not use SQL; like DataLens, the interface was in C and consisted of data structures and function calls.
The SAG standardization efforts presented an opportunity for Microsoft to adapt their Jet system to the new CLI standard. This would not only make Windows a premier platform for CLI development, but also allow users to use SQL to access both Jet and other databases as well. What was missing was the SQL parser that could convert those calls from their text form into the C-interface used in Jet. To solve this, MS partnered with PageAhead Software to use their existing query processor, SIMBA. SIMBA was used as a parser above Jet's C library, turning Jet into an SQL database. And because Jet could forward those C-based calls to other databases, this also allowed SIMBA to query other systems. Microsoft included drivers for Excel to turn its spreadsheet documents into SQL-accessible database tables.[10]
Release and continued development
ODBC 1.0 was released in September 1992.[11] At the time, there was little direct support for SQL databases (versus ISAM), and early drivers were noted for poor performance. Some of this was unavoidable due to the path that the calls took through the Jet-based stack; ODBC calls to SQL databases were first converted from Simba Technologies's SQL dialect to Jet's internal C-based format, then passed to a driver for conversion back into SQL calls for the database. Digital Equipment and Oracle both contracted Simba Technologies to develop drivers for their databases as well.[12]
Circa 1993, OpenLink Software shipped one of the first independently developed third-party ODBC drivers, for the PROGRESS DBMS,[13] and soon followed with their UDBC (a cross-platform API equivalent of ODBC and the SAG/CLI) SDK and associated drivers for PROGRESS, Sybase, Oracle, and other DBMS, for use on Unix-like OS (AIX, HP-UX, Solaris, Linux, etc.), VMS, Windows NT, OS/2, and other OS.[14]
Meanwhile, the CLI standard effort dragged on, and it was not until March 1995 that the definitive version was finalized. By then, Microsoft had already granted Visigenic Software a source code license to develop ODBC on non-Windows platforms. Visigenic ported ODBC to the classic Mac OS, and a wide variety of Unix platforms, where ODBC quickly became the de facto standard.[15] "Real" CLI is rare today. The two systems remain similar, and many applications can be ported from ODBC to CLI with few or no changes.[16]
Over time, database vendors took over the driver interfaces and provided direct links to their products. Skipping the intermediate conversions to and from Jet or similar wrappers often resulted in higher performance. However, by then Microsoft had changed focus to their OLE DB[17] concept (recently reinstated [18]), which provided direct access to a wider variety of data sources from address books to text files. Several new systems followed which further turned their attention from ODBC, including ActiveX Data Objects (ADO) and ADO.net, which interacted more or less with ODBC over their lifetimes.
As Microsoft turned its attention away from working directly on ODBC, the Unix field was increasingly embracing it. This was propelled by two changes within the market, the introduction of graphical user interfaces (GUIs) like GNOME that provided a need to access these sources in non-text form, and the emergence of open software database systems like PostgreSQL and MySQL, initially under Unix. The later adoption of ODBC by Apple for using the standard Unix-side iODBC package Mac OS X 10.2 (Jaguar)[19] (which OpenLink Software had been independently providing for Mac OS X 10.0 and even Mac OS 9 since 2001[20]) further cemented ODBC as the standard for cross-platform data access.
Sun Microsystems used the ODBC system as the basis for their own open standard, Java Database Connectivity (JDBC). In most ways, JDBC can be considered a version of ODBC for the programming language Java instead of C. JDBC-to-ODBC bridges allow Java-based programs to access data sources through ODBC drivers on platforms lacking a native JDBC driver, although these are now relatively rare. Inversely, ODBC-to-JDBC bridges allow C-based programs to access data sources through JDBC drivers on platforms or from databases lacking suitable ODBC drivers.
ODBC today
ODBC remains in wide use today, with drivers available for most platforms and most databases. It is not uncommon to find ODBC drivers for database engines that are meant to be embedded, like SQLite, as a way to allow existing tools to act as front-ends to these engines for testing and debugging.[21]
However, the rise of thin client computing using HTML as an intermediate format has reduced the need for ODBC. Many web development platforms contain direct links to target databases – MySQL being very common. In these scenarios, there is no direct client-side access nor multiple client software systems to support; everything goes through the programmer-supplied HTML application. The virtualization that ODBC offers is no longer a strong requirement, and development of ODBC is no longer as active as it once was. But while ODBC is no longer a strong requirement for client-server programming, it is now more important for access, virtualization, and integration in analytics and data science scenarios. These new requirements are reflected in new ODBC 4.0 features such as semi-structured and hierarchical data, web authentication and performance improvement.
ODBC specifications[22]
- 1.0: released in September 1992[23]
- 2.0: c. 1994
- 2.5
- 3.0: c. 1995, John Goodson of Intersolv and Frank Pellow and Paul Cotton of IBM provided significant input to ODBC 3.0[24]
- 3.5: c. 1997
- 3.8: c. 2009, with Windows 7[25]
- 4.0: Development announced June 2016[26] with first implementation with SQL Server 2017 released Sep 2017 and additional desktop drivers late 2018 final spec on Github
Desktop Database Drivers[27]
- 1.0 (1993-08): Used the SIMBA query processor produced by PageAhead Software.
- 2.0 (1994-12): Used with ODBC 2.0.
- 3.0 (1995-10): Supports Windows 95 and Windows NT Workstation or NT Server 3.51. Only 32-bit drivers were included in this release.
- 3.5 (1996-10): Supports double-byte character set (DBCS), and accommodated the use of File data source names (DSNs). The Microsoft Access driver was released in an RISC version for use on Alpha platforms for Windows 95/98 and Windows NT 3.51 and later operating systems.
- 4.0 (late 1998): Support Microsoft Jet Engine Unicode format along with compatibility for ANSI format of earlier versions.
Drivers and Managers
Drivers
ODBC is based on the device driver model, where the driver encapsulates the logic needed to convert a standard set of commands and functions into the specific calls required by the underlying system. For instance, a printer driver presents a standard set of printing commands, the API, to applications using the printing system. Calls made to those APIs are converted by the driver into the format used by the actual hardware, say PostScript or PCL.
In the case of ODBC, the drivers encapsulate many functions that can be broken down into several broad categories. One set of functions is primarily concerned with finding, connecting to and disconnecting from the DBMS that driver talks to. A second set is used to send SQL commands from the ODBC system to the DBMS, converting or interpreting any commands that are not supported internally. For instance, a DBMS that does not support cursors can emulate this functionality in the driver. Finally, another set of commands, mostly used internally, is used to convert data from the DBMS's internal formats to a set of standardized ODBC formats, which are based on the C language formats.
An ODBC driver enables an ODBC-compliant application to use a data source, normally a DBMS. Some non-DBMS drivers exist, for such data sources as CSV files, by implementing a small DBMS inside the driver itself. ODBC drivers exist for most DBMSs, including Oracle, PostgreSQL, MySQL, Microsoft SQL Server (but not for the Compact aka CE edition), Sybase ASE, SAP HANA[28][29] and IBM Db2. Because different technologies have different capabilities, most ODBC drivers do not implement all functionality defined in the ODBC standard. Some drivers offer extra functionality not defined by the standard.
Driver Manager
Device drivers are normally enumerated, set up and managed by a separate Manager layer, which may provide additional functionality. For instance, printing systems often include functionality to provide spooling functionality on top of the drivers, providing print spooling for any supported printer.
In ODBC the Driver Manager (DM) provides these features.[30] The DM can enumerate the installed drivers and present this as a list, often in a GUI-based form.
But more important to the operation of the ODBC system is the DM's concept of a Data Source Name (DSN). DSNs collect additional information needed to connect to a specific data source, versus the DBMS itself. For instance, the same MySQL driver can be used to connect to any MySQL server, but the connection information to connect to a local private server is different from the information needed to connect to an internet-hosted public server. The DSN stores this information in a standardized format, and the DM provides this to the driver during connection requests. The DM also includes functionality to present a list of DSNs using human readable names, and to select them at run-time to connect to different resources.
The DM also includes the ability to save partially complete DSN's, with code and logic to ask the user for any missing information at runtime. For instance, a DSN can be created without a required password. When an ODBC application attempts to connect to the DBMS using this DSN, the system will pause and ask the user to provide the password before continuing. This frees the application developer from having to create this sort of code, as well as having to know which questions to ask. All of this is included in the driver and the DSNs.
Bridging configurations
A bridge is a special kind of driver: a driver that uses another driver-based technology.
ODBC-to-JDBC (ODBC-JDBC) bridges
An ODBC-JDBC bridge consists of an ODBC driver which uses the services of a JDBC driver to connect to a database. This driver translates ODBC function-calls into JDBC method-calls. Programmers usually use such a bridge when they lack an ODBC driver for some database but have access to a JDBC driver. Examples: OpenLink ODBC-JDBC Bridge, SequeLink ODBC-JDBC Bridge.
JDBC-to-ODBC (JDBC-ODBC) bridges
A JDBC-ODBC bridge consists of a JDBC driver which employs an ODBC driver to connect to a target database. This driver translates JDBC method calls into ODBC function calls. Programmers usually use such a bridge when a given database lacks a JDBC driver, but is accessible through an ODBC driver. Sun Microsystems included one such bridge in the JVM, but viewed it as a stop-gap measure while few JDBC drivers existed (The built-in JDBC-ODBC bridge was dropped from the JVM in Java 8[31]). Sun never intended its bridge for production environments, and generally recommended against its use. As of 2008 independent data-access vendors deliver JDBC-ODBC bridges which support current standards for both mechanisms, and which far outperform the JVM built-in. Examples: OpenLink JDBC-ODBC Bridge, SequeLink JDBC-ODBC Bridge.
OLE DB-to-ODBC bridges
An OLE DB-ODBC bridge consists of an OLE DB Provider which uses the services of an ODBC driver to connect to a target database. This provider translates OLE DB method calls into ODBC function calls. Programmers usually use such a bridge when a given database lacks an OLE DB provider, but is accessible through an ODBC driver. Microsoft ships one, MSDASQL.DLL, as part of the MDAC system component bundle, together with other database drivers, to simplify development in COM-aware languages (e.g. Visual Basic). Third parties have also developed such, notably OpenLink Software whose 64-bit OLE DB Provider for ODBC Data Sources filled the gap when Microsoft initially deprecated this bridge for their 64-bit OS.[32] (Microsoft later relented, and 64-bit Windows starting with Windows Server 2008 and Windows Vista SP1 have shipped with a 64-bit version of MSDASQL.) Examples: OpenLink OLEDB-ODBC Bridge, SequeLink OLEDB-ODBC Bridge.
ADO.NET-to-ODBC bridges
An ADO.NET-ODBC bridge consists of an ADO.NET Provider which uses the services of an ODBC driver to connect to a target database. This provider translates ADO.NET method calls into ODBC function calls. Programmers usually use such a bridge when a given database lacks an ADO.NET provider, but is accessible through an ODBC driver. Microsoft ships one as part of the MDAC system component bundle, together with other database drivers, to simplify development in C#. Third parties have also developed such. Examples: OpenLink ADO.NET-ODBC Bridge, SequeLink ADO.NET-ODBC Bridge.
See also
- GNU Data Access
- Java Database Connectivity (JDBC)
- Windows Open Services Architecture
- ODBC Administrator
References
- Bibliography
- Geiger, Kyle (1995). Inside ODBC. Microsoft Press. ISBN 9781556158155.
- Citations
- McGlinn, Evan (1988), Blueprint Lets 1-2-3 Access Outside Data", InfoWorld, vol. 10, no. 14, 4 April 1988, pp. 1, 69
- Geiger 1995, p. 65.
- Geiger 1995, p. 86-87.
- Geiger 1995, p. 56.
- Geiger 1995, p. 106.
- Geiger 1995, p. 165.
- Geiger 1995, p. 186-187.
- ISO/IEC 9075-3 – Information technology – Database languages – SQL – Part 3: Call-Level Interface (SQL/CLI)
- Geiger 1995, p. 203.
- Harindranath, G; Jože Zupančič (2001). New perspectives on information systems development: theory, methods, and practice. Springer. p. 451. ISBN 978-0-306-47251-0. Retrieved 2010-07-28.
The first ODBC drivers […] used the SIMBA query processor, which translated calls into the Microsoft Jet ISAM calls, and dispatched the calls to the appropriate ISAM driver to access the backend […]
- "Linux/UNIX ODBC – What is ODBC?".
- "Our History", Simba Technologies
- Idehen, Kingsley Uyi (October 1994). "ODBC and progress V7.2d". Usenet Newsgroup comp.databases. Retrieved 13 December 2013.
- Idehen, Kingsley Uyi (1995-07-18). "Need ODBC/Ingres driver for DEC OSF/1". Usenet Newsgroup comp.databases.oracle. Retrieved 13 December 2013.
- Sippl, Roger (1996) "SQL Access Group's Call-Level Interface", Dr. Dobbs, 1 February 1996
- "Similarities and differences between ODBC and CLI", InfoSphere Classic documentation, IBM, 26 September 2008
- "OLE DB and SQL Server: History, End-Game, and some Microsoft "dirt"". 25 September 2011.
- "Announcing the new release of OLE DB Driver for SQL Server".
- Anderson, Andrew (2003-06-20). "Open Database Connectivity in Jaguar". O'Reilly MacDevCenter.com. O'Reilly Media, Inc. Retrieved 13 December 2013.
- Sellers, Dennis (2001-07-17). "ODBC SDK update out for Mac OS Classic, Mac OS X". MacWorld. IDG Consumer & SMB. Retrieved 13 December 2013.
- Werner, Christian (2018) "SQLite ODBC Driver" Archived 2014-06-26 at the Wayback Machine, 2018-02-24
- "ODBC Versions". Linux/UNIX ODBC. Easysoft. Retrieved 2009-10-27.
- Antal, Tiberiu Alexandru. "Access to an Oracle database using JDBC" (PDF). Cluj-Napoca: Technical University of Cluj-Napoca. p. 2. Archived from the original (PDF) on 2011-07-22. Retrieved 2009-10-27.
ODBC 1.0 was released in September 1992
- Microsoft Corporation. Microsoft ODBC 3.0 Programmer's Reference and SDK Guide, Volume 1. Microsoft Press. February 1997. (ISBN 9781572315167)
- "What's New in ODBC 3.8". Microsoft. Retrieved 2010-01-13.
Windows 7 includes an updated version of ODBC, ODBC 3.8.
- Rukmangathan, Krishnakumar (2016-06-07). "A new release of ODBC for Modern Data Stores". Microsoft Data Access / SQL BI Technologies Blog. Microsoft. Retrieved 2017-01-03.
After more than 15 years since the last release, Microsoft is looking at updating the Open Data Base Connectivity (ODBC) specification.
- "History of the Desktop Database Drivers".
- "SAP HANA System Properties". DB-Engines. Retrieved 2016-03-28.
- "Connect to SAP HANA via ODBC - SAP HANA Developer Guide for SAP HANA Studio - SAP Library". help.sap.com. Retrieved 2016-03-28.
- Sybase. "Introduction to ODBC". infocenter.sybase.com. Sybase. Retrieved 8 October 2011.
- "Java JDBC API". docs.oracle.com. Retrieved 18 December 2018.
- Microsoft, "Data Access Technologies Road Map", Deprecated MDAC Components, Microsoft "ADO Programmer's Guide" Appendix A: Providers, Microsoft OLE DB Provider for ODBC, retrieved July 30, 2005. Archived 2001 October 5 at the Wayback Machine