Adding analytics to PDK

Jesse Scott

TL;DR: We are adding anonymous usage reporting to PDK in the next release, very similar to what is in Bolt. PDK will ask you on first use if you want to opt out. You can also opt out later by editing a config file or setting an environment variable.

Hello everyone,

The PDK team would like to let you know that the next version of PDK will include some basic usage reporting/analytics code to help us measure overall adoption and better understand the ways users are interacting with PDK.

All reporting is anonymous and we redact anything that could be considered sensitive before it leaves your system.

Furthermore, to help everyone better understand the shape and scale of the Puppet content developer community, we intend to make aggregate usage data available on a public dashboard in the future.

Below is a draft of the updated PDK documentation that describes what data is collected and reported, as well as how to opt out. The draft does not yet mention that you can also opt out by setting the environment variable "PDK_DISABLE_ANALYTICS=true".
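
For scripts that call PDK, for example in CI, the environment variable can be set per invocation rather than globally. The following is a minimal Python sketch, assuming pdk is on the PATH. The helper names analytics_disabled_env and run_pdk are hypothetical and not part of PDK.

```python
import os
import subprocess


def analytics_disabled_env(base_env=None):
    """Return a copy of the environment with PDK analytics disabled."""
    env = dict(os.environ if base_env is None else base_env)
    # Variable name and value are taken from the announcement above.
    env["PDK_DISABLE_ANALYTICS"] = "true"
    return env


def run_pdk(*args):
    """Run a PDK command with analytics disabled (assumes pdk is installed)."""
    return subprocess.run(["pdk", *args], env=analytics_disabled_env(), check=True)
```

Calling run_pdk("validate") would then run "pdk validate" with the variable set for that process only.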

Please let us know if you have any questions or concerns.


-- The PDK Team

PDK data collection

PDK collects usage data to help us understand how it's being used and how we can improve it. You can opt out of data collection at any time; see the section below about opting out.

We collect these values for every analytics event:
  • A random non-identifying user ID. This ID is shared with Bolt analytics, if you've installed Bolt and enabled analytics.
  • PDK installation method (package or gem).
  • Version of PDK.
  • Operating system and version.

For every successful command line invocation of PDK, we collect:
  • The PDK command executed, such as "pdk new module" or "pdk validate".
  • Anonymised command options and arguments.
  • The version of Ruby used to execute the PDK command.
  • The output formats for the command.
  • PDK_* environment variables and their values, if set.
  • Whether a template repository, if used, is default or custom — we do not record the path to the template repo itself.
  • If the default template repo is used, we collect events for each file rendered, recording whether the file is unmanaged, deleted, customized, or default. For customized files, we do not record what changed, only that it was changed in the .sync.yml file.

Note: All arguments and non-Boolean option values, except those of --puppet-version and --pe-version, are redacted in our collected data.
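
To illustrate the redaction rule in the note above, here is a rough Python sketch. It is not PDK's actual implementation: the function name, the "redacted" placeholder, and the dictionary representation of parsed options are assumptions made for the example, and the option names in the usage note are only examples.

```python
# Options whose values are reported as-is, per the note above.
UNREDACTED_OPTIONS = {"--puppet-version", "--pe-version"}

# Placeholder string; the real placeholder PDK uses is an assumption here.
REDACTED = "redacted"


def redact_invocation(options, arguments):
    """Redact option values and positional arguments before reporting.

    options:   dict mapping option names to parsed values
    arguments: list of positional arguments
    """
    safe_options = {}
    for name, value in options.items():
        if isinstance(value, bool) or name in UNREDACTED_OPTIONS:
            # Boolean flags and the two version options are kept.
            safe_options[name] = value
        else:
            safe_options[name] = REDACTED
    # Every positional argument is redacted.
    safe_arguments = [REDACTED for _ in arguments]
    return safe_options, safe_arguments
```

Under these assumptions, an invocation with the options {"--puppet-version": "6.4.0", "--template-url": "https://example.com/t.git"} and the argument "my_module" would keep the Puppet version but report the template URL and the module name only as the placeholder.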

Invalid commands are submitted as distinct analytics events with the arguments and option values redacted.

To see the data PDK collects, add --debug to a command.

We strictly test the analytics calls to ensure that no unexpected data is passed in.

Opting out of PDK data collection

The first time you run PDK, it asks you if you want to opt out of data collection. To opt out of data collection after that, edit the analytics.yml file, setting the disabled key to true.
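
If you prefer to script this edit, the following Python sketch sets the disabled key while keeping any other lines in the file. It assumes that analytics.yml is a flat YAML file with disabled as a top-level key; the function name disable_analytics is hypothetical, and the path to the file must be supplied by the caller.

```python
from pathlib import Path


def disable_analytics(config_file):
    """Write `disabled: true` to the given analytics.yml, keeping other lines."""
    path = Path(config_file)
    lines = path.read_text().splitlines() if path.exists() else []
    # Drop any existing top-level `disabled:` entry (assumed flat YAML layout).
    kept = [line for line in lines if not line.startswith("disabled:")]
    kept.append("disabled: true")
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text("\n".join(kept) + "\n")
```

Editing the file by hand has the same effect; the sketch is only a convenience for provisioning scripts.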

The location of this configuration file depends on your operating system and configuration:
  • For most *nix systems, where the $XDG_CONFIG_HOME variable is set: ${XDG_CONFIG_HOME}/puppet/analytics.yml
  • For most macOS systems, where the $XDG_CONFIG_HOME variable is not set: ~/.config/puppet/analytics.yml
  • For Windows: %LOCALAPPDATA%/puppet/analytics.yml
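
The lookup described in the list above can be sketched in Python as follows. This only mirrors the three documented locations and is not taken from PDK's source; the function name analytics_config_path and its parameters are hypothetical, and treating an empty XDG_CONFIG_HOME as unset is an assumption.

```python
import os
import sys
from pathlib import Path


def analytics_config_path(env=os.environ, platform=sys.platform, home=None):
    """Return the expected location of analytics.yml for this system."""
    if platform.startswith("win"):
        # Windows: %LOCALAPPDATA%/puppet/analytics.yml
        base = Path(env["LOCALAPPDATA"])
    elif env.get("XDG_CONFIG_HOME"):
        # XDG_CONFIG_HOME is set: ${XDG_CONFIG_HOME}/puppet/analytics.yml
        base = Path(env["XDG_CONFIG_HOME"])
    else:
        # XDG_CONFIG_HOME is not set: ~/.config/puppet/analytics.yml
        base = (Path(home) if home else Path.home()) / ".config"
    return base / "puppet" / "analytics.yml"
```

Calling analytics_config_path() with no arguments uses the current process environment and platform; the parameters exist so the lookup can be checked for other systems.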