pointcloudset.dataset module

class pointcloudset.dataset.Dataset(data: list[dask.delayed.DelayedLeaf] = [], timestamps: list[datetime.datetime] = [], meta: dict = {'orig_file': '', 'topic': ''})

Bases: DatasetCore

Dataset class which contains multiple pointclouds, timestamps, and metadata. For details on how to use the Dataset class, please refer to the usage.ipynb notebook for an interactive tutorial. The notebook can also be found in the tutorial section of the documentation.

classmethod from_file(file_path: Path, **kwargs)

Reads a Dataset from a file. For larger ROS bag files, use the command-line tool pointcloudset to convert the ROS file beforehand.

Supported are the native format, which is a directory filled with fastparquet frames, and ROS bag files (.bag).

Parameters
  • file_path (Path) – Path to the file.

  • **kwargs – Keyword arguments to pass to the reader.

Returns

Dataset object from file.

Return type

Dataset

Raises
  • ValueError – If file format is not supported.

  • TypeError – If file_path is not a Path object.

Examples

pointcloudset.Dataset.from_file(bag_file, topic="lidar/points", keep_zeros=False)

to_file(file_path: Path = PosixPath('.'), **kwargs) None

Writes a Dataset to a file.

Supported is the native format, which is a directory of fastparquet files with metadata.

Parameters
  • file_path (Path) – Destination directory. Defaults to PosixPath('.').

  • **kwargs – Keyword arguments to pass to the writer.
classmethod from_instance(library: str, instance: list[pointcloudset.pointcloud.PointCloud], **kwargs) Dataset

Converts a library instance to a pointcloudset Dataset.

Parameters
  • library (str) –

    Name of the library.

    If “pointclouds”: pointcloudset.io.dataset.pointclouds.dataset_from_pointclouds()

  • instance (list[PointCloud]) – Instance from which to convert.

  • **kwargs – Keyword arguments to pass to func.

Returns

Dataset object derived from the instance.

Return type

Dataset

Raises

ValueError – If instance is not supported.

Examples

pointcloudset.Dataset.from_instance("pointclouds", [pc1, pc2])
apply(func: collections.abc.Callable[[pointcloudset.pointcloud.PointCloud], pointcloudset.pointcloud.PointCloud] | collections.abc.Callable[[pointcloudset.pointcloud.PointCloud], Any], warn: bool = True, **kwargs) pointcloudset.dataset.Dataset | pointcloudset.pipeline.delayed_result.DelayedResult

Applies a function to the dataset. It is also possible to pass keyword arguments.

Parameters
  • func (Union[Callable[[PointCloud], PointCloud], Callable[[PointCloud], Any]]) – Function to apply. If it returns a PointCloud and has the corresponding type hint, a new Dataset will be generated.

  • warn (bool) – If True, a warning is emitted when the result is not a Dataset; if False, the warning is suppressed.

  • **kwargs – Keyword arguments to pass to func.

Returns

A Dataset if the function returns a PointCloud, otherwise a DelayedResult object which is a tuple of dask delayed objects.

Return type

Union[Dataset, DelayedResult]

Examples

def func(pointcloud: pointcloudset.PointCloud) -> pointcloudset.PointCloud:
    return pointcloud.limit("x", 0, 1)

dataset.apply(func)
# This results in a new Dataset
def func(pointcloud:pointcloudset.PointCloud) -> float:
    return pointcloud.data.x.max()

dataset.apply(func)
def func(pointcloud:pointcloudset.PointCloud, test: float) -> float:
    return pointcloud.data.x.max() + test

dataset.apply(func, test=10)
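Whether apply returns a new Dataset or a DelayedResult is decided by the function's return type hint. A minimal, self-contained sketch of that dispatch logic (the PointCloud class here is a hypothetical stand-in, not the library's):

```python
import typing


class PointCloud:  # hypothetical stand-in for pointcloudset.PointCloud
    pass


def returns_pointcloud(func) -> bool:
    """Sketch of the type-hint check apply() can use to decide whether
    the results should be wrapped in a new Dataset."""
    hint = typing.get_type_hints(func).get("return")
    return getattr(hint, "__name__", None) == "PointCloud"


def keeps_cloud(pc: PointCloud) -> PointCloud:
    return pc


def reduces_cloud(pc: PointCloud) -> float:
    return 0.0


print(returns_pointcloud(keeps_cloud))   # True: result would become a Dataset
print(returns_pointcloud(reduces_cloud)) # False: result would stay delayed
```

This is why the type hint on func matters: a function without a PointCloud return annotation yields a DelayedResult even if it happens to return a PointCloud.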
property has_original_id: bool

Check if all pointclouds in the Dataset have original_ids.

Returns

True if all PointClouds in the Dataset have original_ids.

Return type

bool

agg(agg: str | list | dict, depth: Literal['dataset', 'pointcloud', 'point'] = 'dataset') pandas.core.series.Series | list[pandas.core.frame.DataFrame] | pandas.core.frame.DataFrame

Aggregate using one or more operations over the whole dataset. Similar to pandas.DataFrame.aggregate(). Uses dask.dataframe.DataFrame with parallel processing.

Parameters
  • agg (Union[str, list, dict]) – Function to use for aggregating.

  • depth (Literal["dataset", "pointcloud", "point"], optional) – Aggregation level: “dataset”, “pointcloud” or “point”. Defaults to “dataset”.

Returns

Results of the aggregation. This can be a pandas DataFrame or Series, depending on the depth and aggregation.

Return type

Union[pandas.Series, list[pandas.DataFrame], pandas.DataFrame]

Raises

ValueError – If depth is not “dataset”, “pointcloud” or “point”.

Examples

dataset.agg("max", "pointcloud")
dataset.agg(["min","max","mean","std"])
dataset.agg({"x" : ["min","max","mean","std"]})
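The depth levels can be illustrated with plain pandas (hypothetical two-frame data; pointcloudset itself runs the aggregation through dask in parallel):

```python
import pandas as pd

# Two hypothetical pointclouds as plain DataFrames
pc1 = pd.DataFrame({"x": [0.0, 1.0], "y": [2.0, 3.0]})
pc2 = pd.DataFrame({"x": [4.0, 5.0], "y": [6.0, 7.0]})

# depth="pointcloud": one aggregate per pointcloud
per_cloud = [pc.agg("max") for pc in (pc1, pc2)]

# depth="dataset": one aggregate over all points in the dataset
whole = pd.concat([pc1, pc2]).agg("max")

print(per_cloud[0]["x"], whole["x"])  # 1.0 5.0
```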
min(depth: str = 'dataset')

Aggregate using min operation over the whole dataset. Similar to pandas.DataFrame.aggregate(). Uses dask.dataframe.DataFrame with parallel processing.

Parameters
  • depth (Literal["dataset", "pointcloud", "point"], optional) – Aggregation level: "dataset", "pointcloud" or "point". Defaults to "dataset".

Returns

Aggregated values.

Return type

Union[pandas.Series, list[pandas.DataFrame], pandas.DataFrame]

Examples

dataset.min()
dataset.min("pointcloud")
dataset.min("point")

Hint

See also: pointcloudset.dataset.Dataset.agg()

Same as:

dataset.agg(["min"])
max(depth: str = 'dataset')

Aggregate using max operation over the whole dataset. Similar to pandas.DataFrame.aggregate(). Uses dask.dataframe.DataFrame with parallel processing.

Parameters
  • depth (Literal["dataset", "pointcloud", "point"], optional) – Aggregation level: "dataset", "pointcloud" or "point". Defaults to "dataset".

Returns

Aggregated values.

Return type

Union[pandas.Series, list[pandas.DataFrame], pandas.DataFrame]

Examples

dataset.max()
dataset.max("pointcloud")
dataset.max("point")

Hint

See also: pointcloudset.dataset.Dataset.agg()

Same as:

dataset.agg(["max"])
mean(depth: str = 'dataset')

Aggregate using mean operation over the whole dataset. Similar to pandas.DataFrame.aggregate(). Uses dask.dataframe.DataFrame with parallel processing.

Parameters
  • depth (Literal["dataset", "pointcloud", "point"], optional) – Aggregation level: "dataset", "pointcloud" or "point". Defaults to "dataset".

Returns

Aggregated values.

Return type

Union[pandas.Series, list[pandas.DataFrame], pandas.DataFrame]

Examples

dataset.mean()
dataset.mean("pointcloud")
dataset.mean("point")

Hint

See also: pointcloudset.dataset.Dataset.agg()

Same as:

dataset.agg(["mean"])
std(depth: str = 'dataset')

Aggregate using std operation over the whole dataset. Similar to pandas.DataFrame.aggregate(). Uses dask.dataframe.DataFrame with parallel processing.

Parameters
  • depth (Literal["dataset", "pointcloud", "point"], optional) – Aggregation level: "dataset", "pointcloud" or "point". Defaults to "dataset".

Returns

Aggregated values.

Return type

Union[pandas.Series, list[pandas.DataFrame], pandas.DataFrame]

Examples

dataset.std()
dataset.std("pointcloud")
dataset.std("point")

Hint

See also: pointcloudset.dataset.Dataset.agg()

Same as:

dataset.agg(["std"])
extend(dataset: Dataset) Dataset

Extends the dataset by another one.

Parameters

dataset (Dataset) – Dataset with which to extend this one.

Returns

Extended dataset.

Return type

Dataset
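
Conceptually, extend appends the other dataset's pointclouds and timestamps to this one's. A minimal sketch with hypothetical stand-in data (a dataset reduced to a pair of lists, not the library's actual internals):

```python
from datetime import datetime

# Hypothetical stand-in: a dataset as (pointclouds, timestamps)
d1 = (["pc1", "pc2"], [datetime(2021, 1, 1), datetime(2021, 1, 2)])
d2 = (["pc3"], [datetime(2021, 1, 3)])

# extend: concatenate the other dataset's clouds and timestamps
pointclouds = d1[0] + d2[0]
timestamps = d1[1] + d2[1]
print(len(pointclouds), len(timestamps))  # 3 3
```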

animate(**kwargs) Figure

Plots and animates the PointClouds in a Dataset as a 3D scatter plot with Plotly. It uses the plot function of each PointCloud (pointcloudset.pointcloud.PointCloud.plot()) and bundles the frames together into an interactive animation.

You can also pass arguments to the Plotly express function plotly.express.scatter_3d().

Parameters

**kwargs – Keyword arguments to pass to the plot of each single pointcloud and to Plotly Express.

Returns

The interactive Plotly plot, best used inside a Jupyter Notebook.

Return type

plotly.graph_objs.Figure

Examples

dataset_bag.animate(hover_data=True, color="intensity")