Skip to content

PYODBC SQL Connector

PYODBC Driver Paths

To use PYODBC SQL Connect you will require the driver path specified below per operating system.

Operating Systems Driver Paths
Windows C:\Program Files\Simba Spark ODBC Driver
MacOS /Library/simba/spark/lib/libsparkodbc_sbu.dylib
Linux 64-bit /opt/simba/spark/lib/64/libsparkodbc_sb64.so
Linux 32-bit /opt/simba/spark/lib/32/libsparkodbc_sb32.so

PYODBCSQLConnection

Bases: ConnectionInterface

PYODBC is an open source python module which allows access to ODBC databases. This allows the user to connect through ODBC to data in azure databricks clusters or sql warehouses.

Uses the databricks API's (2.0) to connect to the sql server.

Parameters:

Name Type Description Default
driver_path str

Driver installed to work with PYODBC

required
server_hostname str

Server hostname for the cluster or SQL Warehouse

required
http_path str

Http path for the cluster or SQL Warehouse

required
access_token str

Azure AD Token

required
Note 1

More fields can be configured here in the connection ie PORT, Schema, etc.

Note 2

When using Unix, Linux or Mac OS brew installation of PYODBC is required for connection.

Source code in src/sdk/python/rtdip_sdk/connectors/odbc/pyodbc_sql_connector.py
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
class PYODBCSQLConnection(ConnectionInterface):
    """
    PYODBC is an open source python module which allows access to ODBC databases.
    This allows the user to connect through ODBC to data in azure databricks clusters or sql warehouses.

    Uses the databricks API's (2.0) to connect to the sql server.

    Args:
        driver_path: Driver installed to work with PYODBC
        server_hostname: Server hostname for the cluster or SQL Warehouse
        http_path: Http path for the cluster or SQL Warehouse
        access_token: Azure AD Token

    Note 1:
        More fields can be configured here in the connection ie PORT, Schema, etc.

    Note 2:
        When using Unix, Linux or Mac OS brew installation of PYODBC is required for connection.
    """

    def __init__(
        self, driver_path: str, server_hostname: str, http_path: str, access_token: str
    ) -> None:
        self.driver_path = driver_path
        self.server_hostname = server_hostname
        self.http_path = http_path
        self.access_token = access_token
        # call auth method
        self.connection = self._connect()
        self.open = True

    def _connect(self):
        """Connects to the endpoint"""
        try:
            return pyodbc.connect(
                "Driver="
                + self.driver_path
                + ";"
                + "HOST="
                + self.server_hostname
                + ";"
                + "PORT=443;"
                + "Schema=default;"
                + "SparkServerType=3;"
                + "AuthMech=11;"
                + "UID=token;"
                + "UserAgentEntry=RTDIP;"
                +
                #'PWD=' + access_token+ ";" +
                "Auth_AccessToken=" + self.access_token + ";"
                "ThriftTransport=2;" + "SSL=1;" + "HTTPPath=" + self.http_path,
                autocommit=True,
            )
        except Exception as e:
            logging.exception("error while connecting to the endpoint")
            raise e

    def close(self) -> None:
        """Closes connection to database."""
        try:
            self.connection.close()
            self.open = False
        except Exception as e:
            logging.exception("error while closing the connection")
            raise e

    def cursor(self) -> object:
        """
        Intiates the cursor and returns it.

        Returns:
          PYODBCSQLCursor: Object to represent a databricks workspace with methods to interact with clusters/jobs.
        """
        try:
            if self.open == False:
                self.connection = self._connect()
            return PYODBCSQLCursor(self.connection.cursor())
        except Exception as e:
            logging.exception("error with cursor object")
            raise e

close()

Closes connection to database.

Source code in src/sdk/python/rtdip_sdk/connectors/odbc/pyodbc_sql_connector.py
79
80
81
82
83
84
85
86
def close(self) -> None:
    """Closes connection to database."""
    try:
        self.connection.close()
        self.open = False
    except Exception as e:
        logging.exception("error while closing the connection")
        raise e

cursor()

Intiates the cursor and returns it.

Returns:

Name Type Description
PYODBCSQLCursor object

Object to represent a databricks workspace with methods to interact with clusters/jobs.

Source code in src/sdk/python/rtdip_sdk/connectors/odbc/pyodbc_sql_connector.py
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
def cursor(self) -> object:
    """
    Intiates the cursor and returns it.

    Returns:
      PYODBCSQLCursor: Object to represent a databricks workspace with methods to interact with clusters/jobs.
    """
    try:
        if self.open == False:
            self.connection = self._connect()
        return PYODBCSQLCursor(self.connection.cursor())
    except Exception as e:
        logging.exception("error with cursor object")
        raise e

PYODBCSQLCursor

Bases: CursorInterface

Object to represent a databricks workspace with methods to interact with clusters/jobs.

Parameters:

Name Type Description Default
cursor object

controls execution of commands on cluster or SQL Warehouse

required
Source code in src/sdk/python/rtdip_sdk/connectors/odbc/pyodbc_sql_connector.py
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
class PYODBCSQLCursor(CursorInterface):
    """
    Object to represent a databricks workspace with methods to interact with clusters/jobs.

    Args:
        cursor: controls execution of commands on cluster or SQL Warehouse
    """

    def __init__(self, cursor: object) -> None:
        self.cursor = cursor

    def execute(self, query: str) -> None:
        """
        Prepares and runs a database query.

        Args:
            query: sql query to execute on the cluster or SQL Warehouse
        """
        try:
            self.cursor.execute(query)

        except Exception as e:
            logging.exception("error while executing the query")
            raise e

    def fetch_all(self) -> list:
        """
        Gets all rows of a query.

        Returns:
            list: list of results
        """
        try:
            result = self.cursor.fetchall()
            cols = [column[0] for column in self.cursor.description]
            result = [list(x) for x in result]
            df = pd.DataFrame(result)
            df.columns = cols
            return df
        except Exception as e:
            logging.exception("error while fetching rows from the query")
            raise e

    def close(self) -> None:
        """Closes the cursor."""
        try:
            self.cursor.close()
        except Exception as e:
            logging.exception("error while closing the cursor")
            raise e

execute(query)

Prepares and runs a database query.

Parameters:

Name Type Description Default
query str

sql query to execute on the cluster or SQL Warehouse

required
Source code in src/sdk/python/rtdip_sdk/connectors/odbc/pyodbc_sql_connector.py
115
116
117
118
119
120
121
122
123
124
125
126
127
def execute(self, query: str) -> None:
    """
    Prepares and runs a database query.

    Args:
        query: sql query to execute on the cluster or SQL Warehouse
    """
    try:
        self.cursor.execute(query)

    except Exception as e:
        logging.exception("error while executing the query")
        raise e

fetch_all()

Gets all rows of a query.

Returns:

Name Type Description
list list

list of results

Source code in src/sdk/python/rtdip_sdk/connectors/odbc/pyodbc_sql_connector.py
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
def fetch_all(self) -> list:
    """
    Gets all rows of a query.

    Returns:
        list: list of results
    """
    try:
        result = self.cursor.fetchall()
        cols = [column[0] for column in self.cursor.description]
        result = [list(x) for x in result]
        df = pd.DataFrame(result)
        df.columns = cols
        return df
    except Exception as e:
        logging.exception("error while fetching rows from the query")
        raise e

close()

Closes the cursor.

Source code in src/sdk/python/rtdip_sdk/connectors/odbc/pyodbc_sql_connector.py
147
148
149
150
151
152
153
def close(self) -> None:
    """Closes the cursor."""
    try:
        self.cursor.close()
    except Exception as e:
        logging.exception("error while closing the cursor")
        raise e