ML.DATASETS.MAKE_CIRCLES¶

Generates a synthetic dataset of two concentric circles — a classic non-linearly-separable shape.

Syntax¶

ML.DATASETS.MAKE_CIRCLES(n_samples, noise, factor, random_state)

Arguments¶

Name	Type	Default	Description
n_samples	int	100	Total number of points generated. Split evenly between the inner and outer circles.
noise	float	None	Standard deviation of Gaussian noise added to each point. Use 0 for clean circles.
factor	float	0.8	Ratio of the inner circle's radius to the outer circle's (0 < factor < 1). Smaller values shrink the inner circle and widen the gap between the two rings.
random_state	int	None	Random seed for reproducible output.

Returns¶

Dataset

A Dataset (DataFrame) with columns [x_0, x_1, target], where target is 0 for the outer circle and 1 for the inner.
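To make the arguments and the returned columns concrete, here is a minimal pure-Python sketch of how such a dataset can be generated. It is an illustration modeled on the behavior described on this page (even split, radius ratio, Gaussian noise, shuffling), not the function's actual implementation. The name `make_circles_sketch`, the unit outer radius, and the handling of an odd `n_samples` are assumptions.

```python
import math
import random


def make_circles_sketch(n_samples=100, noise=None, factor=0.8, random_state=None):
    """Return rows [x_0, x_1, target]. Hypothetical stand-in, not the real function."""
    rng = random.Random(random_state)  # seeded generator for reproducible output
    n_outer = n_samples // 2           # even split; an odd leftover point goes
    n_inner = n_samples - n_outer      # to the inner circle (assumption)

    rows = []
    for i in range(n_outer):           # target 0: outer circle, radius 1 (assumption)
        theta = 2 * math.pi * i / n_outer
        rows.append([math.cos(theta), math.sin(theta), 0])
    for i in range(n_inner):           # target 1: inner circle, radius = factor
        theta = 2 * math.pi * i / n_inner
        rows.append([factor * math.cos(theta), factor * math.sin(theta), 1])

    rng.shuffle(rows)                  # mix the two classes in row order
    if noise:                          # None or 0 leaves the circles clean
        for row in rows:
            row[0] += rng.gauss(0.0, noise)
            row[1] += rng.gauss(0.0, noise)
    return rows


rows = make_circles_sketch(150, 0.05, 0.3, 0)
inner = [r for r in rows if r[2] == 1]
mean_inner_radius = sum(math.hypot(r[0], r[1]) for r in inner) / len(inner)
print(len(rows), len(inner), round(mean_inner_radius, 2))
```

With a small `noise` value, the mean radius of the `target = 1` points stays close to `factor`, which is a quick sanity check on any generated dataset.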

When to use¶

Use ML.DATASETS.MAKE_CIRCLES to generate a synthetic 2-D dataset of two concentric circles. This is the textbook shape that linear classifiers and linear PCA cannot separate, but that Kernel PCA with a suitable kernel can.

A common use is to demonstrate when the kernel trick matters: side by side, plot the raw circles, the projection produced by linear PCA, and the projection produced by ML.DIM_REDUCTION.KERNEL_PCA(kernel="rbf"). Only the third visibly separates the two classes.
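The following pure-Python sketch illustrates why no linear projection can split the circles while a non-linear feature can. It is not Kernel PCA: it uses an explicit, hand-picked feature (the squared radius) to show the same idea that a kernel exploits implicitly. The helper names `ring` and `ranges_overlap`, the radii 1.0 and 0.3, and the 36 test directions are assumptions chosen for illustration.

```python
import math


def ring(radius, n, label):
    """n evenly spaced, noise-free points on a circle of the given radius."""
    return [(radius * math.cos(2 * math.pi * i / n),
             radius * math.sin(2 * math.pi * i / n),
             label) for i in range(n)]


def ranges_overlap(points, feature):
    """True if the two classes' 1-D feature ranges overlap,
    meaning no single threshold on that feature separates them."""
    outer = [feature(x, y) for x, y, t in points if t == 0]
    inner = [feature(x, y) for x, y, t in points if t == 1]
    return max(min(outer), min(inner)) <= min(max(outer), max(inner))


points = ring(1.0, 75, 0) + ring(0.3, 75, 1)

# Linear features: project onto 36 different directions; every one overlaps.
directions = [math.pi * k / 36 for k in range(36)]
linear_fails = all(
    ranges_overlap(points, lambda x, y, a=a: x * math.cos(a) + y * math.sin(a))
    for a in directions
)

# Non-linear feature: squared radius puts the classes near 1.0 and 0.09.
radial_overlaps = ranges_overlap(points, lambda x, y: x * x + y * y)

print(linear_fails, radial_overlaps)  # True False
```

Every straight-line projection leaves the inner ring's values inside the outer ring's range, whereas the squared radius places the two classes at clearly different values.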

Examples¶

Generate 150 noisy points on two circles, with the inner circle at 30% of the outer radius and a fixed seed:

=ML.DATASETS.MAKE_CIRCLES(150, 0.05, 0.3, 0)

The cell containing the formula (B4 in this example) now holds a Dataset reference. Preview the first few rows with ML.DATA.SAMPLE(B4, 5). The columns are x_0, x_1, and target (0 or 1).

Remarks¶

n_samples is the total number of points; sklearn splits it evenly between the inner and outer circle (75 and 75 when you ask for 150).
noise is the standard deviation of Gaussian noise added to each point. Pass 0 (or leave blank) for clean, noise-free circles.
factor is the ratio of the inner circle's radius to the outer circle's, strictly between 0 and 1. A smaller factor shrinks the inner circle, so the two classes are more clearly separated by radius; a larger factor brings the rings closer together.
random_state controls reproducibility. Pin it to an integer if you need the same dataset every run.
The dataset is shuffled, which is sklearn's default, so points from the two classes are interleaved in random row order rather than grouped. Use ML.DATA.QUERY with an ORDER BY target clause to group rows by class for plotting.
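As a rough illustration of the grouping step, this Python sketch sorts a handful of shuffled rows by target and splits them into one series per class. It stands in for the ORDER BY target idea only and does not show ML.DATA.QUERY syntax; the sample rows are made up for the example.

```python
# Hypothetical shuffled rows in the [x_0, x_1, target] column layout.
rows = [
    [0.30, 0.00, 1],
    [1.00, 0.00, 0],
    [0.00, 1.00, 0],
    [0.00, 0.30, 1],
    [-1.00, 0.00, 0],
]

# Similar in spirit to ORDER BY target; Python's sort is stable,
# so rows keep their original order within each class.
grouped = sorted(rows, key=lambda row: row[2])

# Build one (xs, ys) series per class, ready to pass to a scatter plot.
series = {}
for x_0, x_1, target in grouped:
    xs, ys = series.setdefault(target, ([], []))
    xs.append(x_0)
    ys.append(x_1)

print([row[2] for row in grouped])  # [0, 0, 0, 1, 1]
```

Plotting each series in its own color makes the two rings easy to tell apart.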