Concerns about the Community Data License Agreement (CDLA) initiative

xkcd on Standards: https://xkcd.com/927/

This week The Linux Foundation announced the new Community Data License Agreement licenses. I have not yet read the licenses themselves, as I’m still digging out from attending yet another amazing All Things Open, but even without knowing the contents of the licenses I have some questions and concerns.

Jim Zemlin, Executive Director of The Linux Foundation, is quoted in the CDLA press release as saying:

“An open data license is essential for the frictionless sharing of the data that powers both critical technologies and societal benefits…”

Well stated! Open licences for software, data, and other creative works are more important than ever for powering the innovation and invention that move us forward as an expanding civil society. They enable collaboration, without which we’d each be stuck in our own silos, continually reinventing the wheel.

What confuses me, however, is why The Linux Foundation decided to go its own direction rather than support and enhance the well established and understood Creative Commons licenses. There is no mention in the press release that other open data license options already exist, let alone whether they may have some shortcomings that the CDLA is intended to address. But those licenses do exist already, and their use is well documented and accepted by institutions worldwide. Does The Linux Foundation object to these licenses in some way that they created their own? Why fracture the data licensing space in this way?

In fact, Andy Updegrove, in a blog post welcoming CDLA into the world, stated that part of the motivation for its creation was specifically to avoid the license proliferation that plagued the early years of free and open source software and that was (and is) admirably curtailed by Open Source Initiative, its Open Source Definition, and its list of OSI-Approved licenses. By creating their own open data licenses instead of throwing their considerable weight behind the existing Creative Commons licenses, isn’t The Linux Foundation contributing to the license proliferation Updegrove says they’re trying to avoid?

That’s one concern I have with this announcement. The other is related to the source and warden of this new initiative.

The Linux Foundation is registered as a 501(c)(6) non-profit organisation. Colloquially known as a c6, these non-profit organisations are formed to promote shared business interests and are beholden to their members (typically businesses or business people themselves) to do just that. In the case of The Linux Foundation, it was originally formed to promote Linux in the business and enterprise space. It has done that brilliantly, and Linux is now the default operating system in servers, automotive, embedded devices, and just about anything else you can think of (save desktops, where it still has a long way to go).

The CDLA, however, claims it is not solely for businesses:

The CDLA licenses are designed to help governments, academic institutions, businesses and other organizations open up and share data, with the goal of creating communities that curate and share data openly.

While I wholeheartedly support the mission of helping governments, academic institutions, and other organisations open up and share data, I believe these institutions would be better served were the initiative brought by an organisation that is accountable to more than simply its members and their business interests. The Linux Foundation undoubtedly has the best intentions here. I do not question that. I do, however, question having the open data of governments and academic institutions being promoted by a group that by definition must dedicate itself to business interests. While undoubtedly The Linux Foundation would not intend to do so, as a c6 there is a risk that the business interests they are pledged to promote might influence the evolution and application of the CDLA licenses.

The optics of the CDLA initiative would be greatly improved were it instead managed by a 501(c)(3) non-profit. c3 non-profits have no accountability to business interests and instead operate solely for the public good (for, admittedly, very large values of both “public” and “good”). For instance, Creative Commons is a c3. Its mission and vision directly support the open sharing of data from government and academic institutions, as well as businesses, independent researchers, and any other data source:

Creative Commons develops, supports, and stewards legal and technical infrastructure that maximizes digital creativity, sharing, and innovation.

Our vision is nothing less than realizing the full potential of the Internet — universal access to research and education, full participation in culture — to drive a new era of development, growth, and productivity.

Again, I find myself confused why The Linux Foundation felt it necessary to proliferate open data licenses in this way. Hopefully more information and statements are forthcoming to explain their fractionalising move.

Open Source: Why Can’t We Just Make It Easy On New Contributors?

As an Open Source advocate I’m frequently asked to which projects I contribute. My answer is typically, “None right now.” Why is that? There are a few reasons but they mostly boil down to:

  • As someone who does not code often (or enjoy it immensely) I do not feel welcome.
  • It’s an opaque process.

For now I’ll set aside the first reason to focus on the second: the process is opaque.

From my point of view, there is absolutely no excuse for this. Wikis, CMSs and cheap hosting all make it abundantly simple to offer plentiful guidance to new contributors, yet somehow that guidance is rarely provided. New contributors must delve and divine and guess and pester in order to figure out where to start. It’s almost as though the community does this intentionally as a sort of rite of passage for new members. “If you can figure this out then you’re smart enough to become One Of Us.” This is incredibly arrogant and exclusionist.

Why can’t we just make it easy on people? Why can’t we spend a few hours writing some documentation and tutorials up front, guiding newbies along the community-accepted path of contribution? Long time contributors may find the concept of spoon-feeding new members offensive but doing so will not only bring in more people to the fold it will also raise the quality of the contributions from the get-go.

To demonstrate how a project can make it easier for new contributors I’m going to use the Perl project. This is not because it’s such a bad example of how to do things (it’s pretty good, really) but because I’m fond of Perl and want to make some suggestions on how to make it even better. Also, I know that the Perl community is generally friendly and open to ideas on improving community involvement. That said, most of the suggestions are more generally applicable to all Open Source projects.

First of all, kudos to Perl for having a dedicated Get Involved link very prominently displayed there at the top of every page of Perl.org! This visibility is a testament to the importance of contributions and contributors to the Perl community. Involvement is key for them and they put it front and center, right where it deserves to be.

However after that things get less obvious. The site is clean and well-designed, which is a pleasant change from many project pages, but this page is essentially just a list of bullet points with no context or meaning. For instance, what would I as a new contributor be expected to do with the Bugs, testing and reviews section? Is this for reporting bugs (yes) or fixing them (not so much) or…? What is meant by “testing?” Since I pay attention to happenings in the Perl community I know that this means automatically reporting test results to the imminently cool and useful CPAN Testers when I install new CPAN modules, but what if I were entirely new to the Perl world? I may think that “testing” implies a lot more hands-on work, perhaps testing new commits to the core language. It’s not obvious and it really should be.

The tagline of this page is Whatever your skill set or level you can get involved… While I believe this must be true, speaking as a mostly-non-coder this page does not make it obvious how I could get involved with the Perl project. Are there simple bugs I could fix (and if so, how)? Maybe documentation that needs writing/editing? How about project management? Ticket review? Marketing? Event organization? Donations? Not only are these subjects a good way to capture the less technical users and supporters of your project, they’re also a brilliant way for the new coding contributors to get a feel for the project and up to speed on the processes it uses.

Here’s a quick and dirty first draft of a version of this page which would be more useful to new contributors:

Whatever your skill set or level you can get involved…

Community

Some say CPAN is the strength and heart of Perl, but they’re wrong. The people are the heart of Perl.

Getting Started

Ready to dive in? Here’s how to do it!

Everyone
  • Contributor FAQ Every project has those questions which come up repeatedly. Collect them and their answers to help remove the barrier to entry.
  • An overview of the processes used for Perl. Link to another page. Key for everyone to know the steps in the process, not only so they’ll know what to expect but also so they’ll know better what language to use when asking for help.
  • Where to go for help. While this should be an element of every bullet point here, it also should be aggregated on a single page for ease of use. Broken down by subject. Include names whenever possible (mentors would be awesome and heroic). New contributors are more likely to ask for help if it’s obvious there’s a real person at the other end.
  • Where to find the documentation. Link to another page. List every form of docs that exist. Keeping all this in one place drastically reduces frustration for new contributors.
  • Where to find the code. Link to another page. Not just where to find it, but how to get a copy and, if you wish to commit, how you’d go about doing that (including expected code style).
Non-Coding
  • Review and rate – the CPAN modules you use.
  • Send automatic reports to CPAN testers every time you install a new module.
  • Review trouble tickets. Link to a page describing the process, requirements and expectations for reviewing project trouble tickets. Ideally will include a link to a list of tickets in need of review.
  • Write or edit documentation. Link to a page describing the process, requirements and expectations for writing or editing project documentation. Ideally will include a link to a list of documentation tickets.
  • Translate documentation. Link to a page describing the process, requirements and expectations for translating project documentation. Ideally will include a link to a list of translation tickets.
  • Organize or Volunteer at an event. Calendar/list of events (locations and dates), assistance required and whom to contact for more information.
  • Donate. Link to a donation page and/or page detailing whom to contact to give a donation.
  • Have another idea how you could help? Contact us! Link to the correct contact information.
CPAN
  • Best Practices for coding for CPAN.
  • How to report a bug for a CPAN module.
  • How to contribute a fix for a CPAN module.
  • How to contribute a new CPAN module.
  • Contacting CPAN module maintainers and what to do if they don’t respond.
Core

Perl6

Want to learn more about Perl6? Have a look over here! Anticipate the question rather than forcing it to be asked or searched for.

Again, this is just a quick pass at a possible Contributors page, but as a non-contributor it’s much more inviting than the current, sparse version. I can easily look at this and see my options and the methods for performing them.

Granted, there is a lot of organization and writing which would need to be done for a page like this to work. Believe me, I know it’s not easy getting all these ducks in a row. That said, I strongly believe that any project which goes through the steps necessary to make it much easier for new contributors to join will find its quality and reputation improving by leaps and bounds. It’s an effort very much worth making.