Denormalization

Bases: DataManipulationBaseInterface, InputValidator

Applies the appropriate denormalization method to revert values to their original scale.

Example


```python
from rtdip_sdk.pipelines.data_quality.data_manipulation.spark.normalization.denormalization import Denormalization
from pyspark.sql import SparkSession
from pyspark.sql.dataframe import DataFrame

# normalized_df: the PySpark DataFrame to be reverted to its original scale
# normalization: the normalization instance originally used to normalize the data
denormalization = Denormalization(normalized_df, normalization)
denormalized_df = denormalization.filter_data()
```

Parameters:

- df (DataFrame): PySpark DataFrame to be reverted to its original scale.
- normalization_to_revert (NormalizationBaseClass): An instance of the specific normalization subclass (NormalizationZScore, NormalizationMinMax, NormalizationMean) that was originally used to normalize the data.
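To make "revert values to their original scale" concrete, the sketch below shows the textbook inverse for each of the three normalization families named above, using plain Python lists instead of Spark DataFrames. The function names are hypothetical and the formulas are the standard definitions; the SDK's own classes may compute their statistics differently (for example, sample versus population standard deviation), so treat this as an illustration rather than the library's implementation.

```python
from statistics import mean, pstdev


def zscore_normalize(values):
    # z = (x - mean) / std; the statistics are returned so the step can be undone
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values], (mu, sigma)


def zscore_denormalize(normalized, stats):
    # Inverse: x = z * std + mean
    mu, sigma = stats
    return [z * sigma + mu for z in normalized]


def minmax_normalize(values):
    # n = (x - min) / (max - min)
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values], (lo, hi)


def minmax_denormalize(normalized, stats):
    # Inverse: x = n * (max - min) + min
    lo, hi = stats
    return [n * (hi - lo) + lo for n in normalized]


def mean_normalize(values):
    # n = (x - mean) / (max - min)
    mu, lo, hi = mean(values), min(values), max(values)
    return [(v - mu) / (hi - lo) for v in values], (mu, lo, hi)


def mean_denormalize(normalized, stats):
    # Inverse: x = n * (max - min) + mean
    mu, lo, hi = stats
    return [n * (hi - lo) + mu for n in normalized]


# Note: all three forward steps divide by a spread, so constant input
# (max == min, or std == 0) is not handled in this sketch.
original = [10.0, 20.0, 30.0, 50.0]
for normalize, denormalize in [
    (zscore_normalize, zscore_denormalize),
    (minmax_normalize, minmax_denormalize),
    (mean_normalize, mean_denormalize),
]:
    scaled, stats = normalize(original)
    restored = denormalize(scaled, stats)
    # Each round trip recovers the original values up to floating-point error
    assert all(abs(a - b) < 1e-9 for a, b in zip(original, restored))
```

In every case the inverse needs the statistics computed during normalization, which is why the component takes the original normalization instance rather than just the name of a method.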

Source code in src/sdk/python/rtdip_sdk/pipelines/data_quality/data_manipulation/spark/normalization/denormalization.py
class Denormalization(DataManipulationBaseInterface, InputValidator):
    """
    Applies the appropriate denormalization method to revert values to their original scale.

    Example
    --------
    ```python
    from rtdip_sdk.pipelines.data_quality.data_manipulation.spark.normalization.denormalization import Denormalization
    from pyspark.sql import SparkSession
    from pyspark.sql.dataframe import DataFrame

    denormalization = Denormalization(normalized_df, normalization)
    denormalized_df = denormalization.filter_data()
    ```

    Parameters:
        df (DataFrame): PySpark DataFrame to be reverted to its original scale.
        normalization_to_revert (NormalizationBaseClass): An instance of the specific normalization subclass (NormalizationZScore, NormalizationMinMax, NormalizationMean) that was originally used to normalize the data.
    """

    df: PySparkDataFrame
    normalization_to_revert: NormalizationBaseClass

    def __init__(
        self, df: PySparkDataFrame, normalization_to_revert: NormalizationBaseClass
    ) -> None:
        self.df = df
        self.normalization_to_revert = normalization_to_revert

    @staticmethod
    def system_type():
        """
        Attributes:
            SystemType (Environment): Requires PYSPARK
        """
        return SystemType.PYSPARK

    @staticmethod
    def libraries():
        libraries = Libraries()
        return libraries

    @staticmethod
    def settings() -> dict:
        return {}

    def filter_data(self) -> PySparkDataFrame:
        return self.normalization_to_revert.denormalize(self.df)
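As the listing shows, `filter_data` does no arithmetic of its own; it hands the DataFrame to the `denormalize` method of the normalization object it was given. The following is a minimal sketch of that delegation pattern with plain Python lists. `MinMaxSketch` and `DenormalizationSketch` are hypothetical stand-ins written for this illustration, not the SDK classes, and the assumption that the normalization object stores its fitted statistics when it runs is mine.

```python
from typing import List, Optional


class MinMaxSketch:
    """Hypothetical stand-in for a normalization component (not the SDK class)."""

    def __init__(self, values: List[float]) -> None:
        self.values = values
        self.lo: Optional[float] = None
        self.hi: Optional[float] = None

    def filter_data(self) -> List[float]:
        # Assumption: statistics are captured on the instance while normalizing
        self.lo, self.hi = min(self.values), max(self.values)
        span = self.hi - self.lo
        return [(v - self.lo) / span for v in self.values]

    def denormalize(self, normalized: List[float]) -> List[float]:
        if self.lo is None or self.hi is None:
            raise ValueError("normalization has not been applied yet")
        span = self.hi - self.lo
        return [n * span + self.lo for n in normalized]


class DenormalizationSketch:
    """Mirrors the shape of the class above: store both inputs, then delegate."""

    def __init__(self, values: List[float], normalization_to_revert: MinMaxSketch) -> None:
        self.values = values
        self.normalization_to_revert = normalization_to_revert

    def filter_data(self) -> List[float]:
        # Delegation: the normalization object knows how to undo itself
        return self.normalization_to_revert.denormalize(self.values)


normalization = MinMaxSketch([10.0, 20.0, 30.0, 50.0])
normalized = normalization.filter_data()
restored = DenormalizationSketch(normalized, normalization).filter_data()
```

One consequence of this design is that a freshly constructed normalization object would not be enough in the sketch, because the information needed to reverse the scaling lives on the instance that performed it.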

system_type() staticmethod

Attributes:

| Name | Type | Description |
|------|------|-------------|
| SystemType | Environment | Requires PYSPARK |

Source code in src/sdk/python/rtdip_sdk/pipelines/data_quality/data_manipulation/spark/normalization/denormalization.py
@staticmethod
def system_type():
    """
    Attributes:
        SystemType (Environment): Requires PYSPARK
    """
    return SystemType.PYSPARK