serotiny.dataframe.transforms.filter module
- serotiny.dataframe.transforms.filter.filter_columns(input: DataFrame | Dict | Sequence[str], columns: Sequence[str] | None = None, startswith: str | None = None, endswith: str | None = None, contains: str | None = None, excludes: str | None = None, regex: str | None = None)
Select columns in a dataset, using different filtering options. See serotiny.dataframe.transforms.filter_columns for more details.
- Parameters:
input (Union[pd.DataFrame, Sequence[str]]) – The input to operate on. If it is a pandas DataFrame, the result is a DataFrame containing only the columns that match the filters. If it is a list of strings, the result is a list containing only the strings that match the filters. The signature also lists Dict as an accepted type; its handling is not described here.
columns (Optional[Sequence[str]] = None) – Explicit list of columns to include. If it is supplied, the remaining filters are ignored.
startswith (Optional[str] = None) – A substring the matching columns must start with
endswith (Optional[str] = None) – A substring the matching columns must end with
contains (Optional[str] = None) – A substring the matching columns must contain
excludes (Optional[str] = None) – A substring the matching columns must not contain
regex (Optional[str] = None) – A string containing a regular expression to be matched
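The filter semantics described above can be illustrated for the list-of-strings case with a small standalone sketch. This is not serotiny's implementation: the helper name `sketch_filter_columns` is hypothetical, and the sketch assumes that all supplied filters must hold at once (logical AND), that `regex` is applied with `re.match`, and that `columns` selects names by membership.

```python
import re
from typing import List, Optional, Sequence


def sketch_filter_columns(
    names: Sequence[str],
    columns: Optional[Sequence[str]] = None,
    startswith: Optional[str] = None,
    endswith: Optional[str] = None,
    contains: Optional[str] = None,
    excludes: Optional[str] = None,
    regex: Optional[str] = None,
) -> List[str]:
    """Hypothetical re-creation of the documented filters for a list of names."""
    # An explicit list takes precedence: the remaining filters are ignored.
    if columns is not None:
        return [name for name in names if name in columns]

    kept = []
    for name in names:
        # Assumption: every supplied filter must be satisfied (logical AND).
        if startswith is not None and not name.startswith(startswith):
            continue
        if endswith is not None and not name.endswith(endswith):
            continue
        if contains is not None and contains not in name:
            continue
        if excludes is not None and excludes in name:
            continue
        # Assumption: the pattern is anchored at the start, as with re.match.
        if regex is not None and re.match(regex, name) is None:
            continue
        kept.append(name)
    return kept


names = ["dna_mean", "dna_std", "membrane_mean", "cell_id"]
by_prefix = sketch_filter_columns(names, startswith="dna")
by_suffix = sketch_filter_columns(names, endswith="_mean", excludes="membrane")
by_regex = sketch_filter_columns(names, regex=r".*_(mean|std)$")
explicit = sketch_filter_columns(names, columns=["cell_id"], startswith="dna")
```

In this sketch `explicit` contains only `cell_id`, because the `startswith` filter is skipped once `columns` is given, mirroring the precedence rule in the parameter description.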
- serotiny.dataframe.transforms.filter.filter_rows(dataframe: DataFrame, column: str, values: Sequence, exclude: bool = False)
Filter a dataframe, keeping only the rows where a given column’s value is contained in a list of values.
- Parameters:
dataframe (pd.DataFrame) – Input dataframe
column (str) – The column to be used for filtering
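values (Sequence) – List of values to filter for
exclude (bool = False) – Judging by the name and default, inverts the filter so that only rows whose value is not in values are kept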
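The row filter described above corresponds closely to a membership test in plain pandas. The following is a minimal sketch rather than serotiny's own code: the helper name `sketch_filter_rows` and the example column names are hypothetical, and treating `exclude=True` as an inversion of the mask is an assumption based on the parameter name.

```python
import pandas as pd


def sketch_filter_rows(dataframe, column, values, exclude=False):
    """Hypothetical equivalent of the documented behaviour, built on Series.isin."""
    # True for rows whose value in `column` appears in `values`.
    mask = dataframe[column].isin(values)
    # Assumption: exclude=True flips the selection.
    if exclude:
        mask = ~mask
    return dataframe[mask]


df = pd.DataFrame(
    {
        "split": ["train", "valid", "test", "train"],
        "cell_id": [1, 2, 3, 4],
    }
)

kept = sketch_filter_rows(df, "split", ["train", "valid"])
dropped = sketch_filter_rows(df, "split", ["train", "valid"], exclude=True)
```

Here `kept` holds the rows whose `split` is `train` or `valid`, and `dropped` holds the remaining row. Both results keep the original index labels, since boolean masking does not reset the index.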