scatterPlotDS {dsBase} | R Documentation |
This function applies one of two disclosure control methods to generate non-disclosive coordinates. The coordinates are returned to the client, which uses them to draw non-disclosive scatter plots.
scatterPlotDS(x, y, method.indicator, k, noise)
x | the name of a numeric vector, the x-variable. |
y | the name of a numeric vector, the y-variable. |
method.indicator | an integer, either 1 or 2. It is set to 1 if the user selects the deterministic method in the client-side function and to 2 if the user selects the probabilistic method. |
k | the number of nearest neighbours whose centroid is calculated if the deterministic method is selected. |
noise | the percentage of the initial variance that is used as the variance of the embedded noise if the probabilistic method is selected. |
If the user chooses the deterministic approach, the function finds the k-1 nearest neighbours of each data point in 2-dimensional space, where the nearest neighbours are the data points with the smallest Euclidean distances from the point of interest. Each point of interest and its k-1 nearest neighbours are then used to calculate the coordinates of the centroid of those k points. The centroid here is the centre of mass: its x-coordinate is the average of the x-coordinates of the k points and its y-coordinate is the average of their y-coordinates.

If the user chooses the probabilistic approach, the function adds random noise to x and y separately. Each noise term follows a normal distribution with zero mean and variance equal to a percentage of the initial variance of the corresponding variable, as set by the noise argument. To protect against disclosure, the random number generator is seeded with a value that is specified by the input variables, so the function always returns the same noisy data for a given pair of variables.
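The two approaches can be illustrated with a minimal base R sketch. This is not the dsBase implementation: the helper names knn_centroids and add_noise are hypothetical, noise is assumed to be passed as a fraction (0.1 for 10 percent), and the seed is taken as an explicit argument because the page does not say how the seed is derived from the input variables.

```r
# Hypothetical helpers for illustration only; not the dsBase source code.

# Deterministic sketch: replace each point by the centroid of itself
# and its k - 1 nearest neighbours (Euclidean distance).
knn_centroids <- function(x, y, k = 3) {
  stopifnot(length(x) == length(y), k >= 1, k <= length(x))
  d <- as.matrix(dist(cbind(x, y)))  # pairwise Euclidean distances
  n <- length(x)
  cx <- numeric(n)
  cy <- numeric(n)
  for (i in seq_len(n)) {
    # The point is at distance 0 from itself, so the k smallest
    # distances pick the point plus its k - 1 nearest neighbours.
    idx <- order(d[i, ])[seq_len(k)]
    cx[i] <- mean(x[idx])
    cy[i] <- mean(y[idx])
  }
  list(x = cx, y = cy)
}

# Probabilistic sketch: add zero-mean normal noise to x and y separately,
# with variance equal to a fraction `noise` of each variable's variance.
# ASSUMPTION: `seed` is supplied by the caller; note that set.seed()
# changes the global random number generator state.
add_noise <- function(x, y, noise = 0.1, seed = 1) {
  stopifnot(length(x) == length(y), noise >= 0)
  set.seed(seed)
  list(
    x = x + rnorm(length(x), mean = 0, sd = sqrt(noise * var(x))),
    y = y + rnorm(length(y), mean = 0, sd = sqrt(noise * var(y)))
  )
}

# Toy data: two well-separated clusters of three points each.
x <- c(0, 1, 0, 10, 11, 10)
y <- c(0, 0, 1, 10, 10, 11)

# With k = 3, every point collapses onto the centroid of its own cluster.
cen <- knn_centroids(x, y, k = 3)

# The same seed gives the same noisy coordinates on every call.
noisy <- add_noise(x, y, noise = 0.1, seed = 42)
```

Both helpers return a list with x and y components, mirroring the shape of the value described on this page. Fixing the seed matters in the probabilistic case: if each call drew fresh noise, repeated requests could be averaged to recover the original values.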
a list with the x and y coordinates of the data to be plotted
Demetris Avraam for DataSHIELD Development Team