Read from Delta with Delta Sharing

PythonDeltaSharingSource

Bases: SourceInterface

The Python Delta Sharing Source reads data from a Delta table shared through Delta Sharing into a Polars LazyFrame, without using Apache Spark.

Example

from rtdip_sdk.pipelines.sources import PythonDeltaSharingSource

python_delta_sharing_source = PythonDeltaSharingSource(
    profile_path="{CREDENTIAL-FILE-LOCATION}",
    share_name="{SHARE-NAME}",
    schema_name="{SCHEMA-NAME}",
    table_name="{TABLE-NAME}"
)

python_delta_sharing_source.read_batch()

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| profile_path | str | Location of the credential file. Can be any URL supported by FSSPEC | required |
| share_name | str | The value of 'share=' for the table | required |
| schema_name | str | The value of 'schema=' for the table | required |
| table_name | str | The value of 'name=' for the table | required |
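
As the `read_batch()` source further down shows, the four parameters are joined into a single Delta Sharing table URL of the form `<profile_path>#<share_name>.<schema_name>.<table_name>`. The following is a minimal sketch of that composition only. The helper name `build_table_url` and the sample values are hypothetical and are not part of the SDK.

```python
def build_table_url(
    profile_path: str, share_name: str, schema_name: str, table_name: str
) -> str:
    """Hypothetical helper mirroring the f-string used inside read_batch()."""
    return f"{profile_path}#{share_name}.{schema_name}.{table_name}"


# Made-up values for illustration only.
table_url = build_table_url(
    "/tmp/config.share", "my_share", "my_schema", "my_table"
)
print(table_url)
```

Seeing the composed URL can help when checking that the share, schema, and table names match what the data provider exposes.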
Source code in src/sdk/python/rtdip_sdk/pipelines/sources/python/delta_sharing.py
class PythonDeltaSharingSource(SourceInterface):
    """
    The Python Delta Sharing Source is used to read data from a Delta table with Delta Sharing configured, without using Apache Spark.

    Example
    -------
    ```python
    from rtdip_sdk.pipelines.sources import PythonDeltaSharingSource

    python_delta_sharing_source = PythonDeltaSharingSource(
        profile_path="{CREDENTIAL-FILE-LOCATION}",
        share_name="{SHARE-NAME}",
        schema_name="{SCHEMA-NAME}",
        table_name="{TABLE-NAME}"
    )

    python_delta_sharing_source.read_batch()
    ```

    Parameters:
        profile_path (str): Location of the credential file. Can be any URL supported by [FSSPEC](https://filesystem-spec.readthedocs.io/en/latest/index.html){ target="_blank" }
        share_name (str): The value of 'share=' for the table
        schema_name (str): The value of 'schema=' for the table
        table_name (str): The value of 'name=' for the table
    """

    profile_path: str
    share_name: str
    schema_name: str
    table_name: str

    def __init__(
        self, profile_path: str, share_name: str, schema_name: str, table_name: str
    ):
        self.profile_path = profile_path
        self.share_name = share_name
        self.schema_name = schema_name
        self.table_name = table_name

    @staticmethod
    def system_type():
        """
        Attributes:
            SystemType (Environment): Requires PYTHON
        """
        return SystemType.PYTHON

    @staticmethod
    def libraries():
        libraries = Libraries()
        return libraries

    @staticmethod
    def settings() -> dict:
        return {}

    def pre_read_validation(self):
        return True

    def post_read_validation(self):
        return True

    def read_batch(self) -> LazyFrame:
        """
        Reads data from a Delta table with Delta Sharing into a Polars LazyFrame.
        """
        pandas_df = delta_sharing.load_as_pandas(
            f"{self.profile_path}#{self.share_name}.{self.schema_name}.{self.table_name}"
        )
        polars_lazyframe = pl.from_pandas(pandas_df).lazy()
        return polars_lazyframe

    def read_stream(self):
        """
        Raises:
            NotImplementedError: Reading from a Delta table with Delta Sharing using Python is only possible for batch reads.
        """
        raise NotImplementedError(
            "Reading from a Delta table with Delta Sharing using Python is only possible for batch reads."
        )

system_type() staticmethod

Attributes:

| Name | Type | Description |
|------|------|-------------|
| SystemType | Environment | Requires PYTHON |

Source code in src/sdk/python/rtdip_sdk/pipelines/sources/python/delta_sharing.py
@staticmethod
def system_type():
    """
    Attributes:
        SystemType (Environment): Requires PYTHON
    """
    return SystemType.PYTHON

read_batch()

Reads data from a Delta table with Delta Sharing into a Polars LazyFrame.

Source code in src/sdk/python/rtdip_sdk/pipelines/sources/python/delta_sharing.py
def read_batch(self) -> LazyFrame:
    """
    Reads data from a Delta table with Delta Sharing into a Polars LazyFrame.
    """
    pandas_df = delta_sharing.load_as_pandas(
        f"{self.profile_path}#{self.share_name}.{self.schema_name}.{self.table_name}"
    )
    polars_lazyframe = pl.from_pandas(pandas_df).lazy()
    return polars_lazyframe

read_stream()

Raises:

| Type | Description |
|------|-------------|
| NotImplementedError | Reading from a Delta table with Delta Sharing using Python is only possible for batch reads. |

Source code in src/sdk/python/rtdip_sdk/pipelines/sources/python/delta_sharing.py
def read_stream(self):
    """
    Raises:
        NotImplementedError: Reading from a Delta table with Delta Sharing using Python is only possible for batch reads.
    """
    raise NotImplementedError(
        "Reading from a Delta table with Delta Sharing using Python is only possible for batch reads."
    )
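
Code that accepts several source types may want to handle this exception rather than let it propagate. The following is a rough sketch of one way to do that. `BatchOnlySource` is a hypothetical stand-in that imitates the batch-only behaviour so the example runs without credentials, and `read_with_fallback` is an illustrative helper, not an SDK function.

```python
class BatchOnlySource:
    """Hypothetical stand-in imitating a source that only supports batch reads."""

    def read_batch(self):
        # Placeholder rows; the real source returns a Polars LazyFrame.
        return ["row-1", "row-2"]

    def read_stream(self):
        raise NotImplementedError(
            "Reading from a Delta table with Delta Sharing using Python "
            "is only possible for batch reads."
        )


def read_with_fallback(source):
    """Try a streaming read first; fall back to a batch read if unsupported."""
    try:
        return "stream", source.read_stream()
    except NotImplementedError:
        return "batch", source.read_batch()


mode, data = read_with_fallback(BatchOnlySource())
print(mode, data)
```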