Creative Commons

A Look Back at the 2020 Virtual CC Global Summit

Thursday, November 12, 2020 at 15:01

1300+ participants | 200+ presenters | 170+ sessions | 60+ countries

CC Global Summit Map: a map showing where participants in the CC Global Summit joined from. Courtesy of Hopin.

The 2020 virtual CC Global Summit exceeded our expectations—over 1300 community members, from Canada and El Salvador to Nigeria and New Zealand, chose to spend a week with us to discuss the future of open, the unknowns of artificial intelligence, the possibilities of open GLAM (galleries, libraries, archives, and museums), the pressing need for copyright reform, the impact of the COVID-19 pandemic, and much more. For the first time ever, the CC Summit was free for all to attend. We also adapted the virtual format to accommodate community members worldwide, with sessions taking place across various time zones and languages.

Facing 2020 at the CC Global Summit

When we began the journey to the 2020 CC Summit back in Fall 2019, we couldn’t have imagined the unique challenges and opportunities this year would bring. The patience, passion, and perseverance displayed by our staff, volunteers, and open community members helped create an event which aimed to, in the words of CC’s Claudio Ruiz, “find a path forward in hope and optimism.”

This year, more than ever before, we wanted the CC Global Summit to be a space that brought people together, nurtured relationships, encouraged collaborations, explored new issues, and provided a safe place for difficult questions. The response to Irene Soria Guzmán’s keynote, “Hacer feminista lo abierto: poniendo nuevos engranes a la cultura libre!” makes us believe we succeeded in that aim. Irene asked participants to look at the open movement through a feminist lens to find new ways of understanding authorship and power, creating bridges across our differences. It was encouraging to see so many community members accept her challenge with grace and enthusiasm.

We also introduced a new session at the 2020 Summit, a global land acknowledgement, where we examined ideas of colonialism, power dynamics, and our own biases as we remixed a version for use in a virtual setting. The end result is a unique visual interpretation of those conversations by artist Sonaksha Iyengar (above).

What’s next?

CC Global Summit Artwork Maro Villar — Credit: Maco Villar (CC-BY).

Over the next week or so, we’ll be publishing all three of the CC Summit keynotes with transcripts to increase accessibility, so stay tuned! While we recorded all 170+ sessions, we plan to first receive permission from the speakers to publicly release these recordings and then create a catalog on the CC Global Summit website of the approved videos. We ask for your patience and understanding during this process, as it will take some time to ensure we respect the privacy of everyone who appeared on video. If you’re eager for video content in the meantime, check out the concert! We also released a campaign featuring the 2020 CC Global Summit artwork by Chilean artist Marco Villar. You can now purchase t-shirts, hoodies, mugs, and tote bags with this year’s artwork, and support Creative Commons at the same time! Want to make your own CC Summit-inspired pieces? Download the artwork here.

Thank you!

We’d like to extend our sincere thanks to everyone who made this event one of our best yet, despite all that’s happened in 2020. This includes the volunteers who wowed us with their energy, responsiveness, and commitment throughout the event, as well as the presenters and performers who made this event a unique and exciting adventure. Each of you gave us the insight and the opportunity to imagine what the open movement could be in the future, and for that, we are incredibly grateful.

The 2020 CC Global Summit also wouldn’t have been possible without our generous sponsors:

CC Global Summit Sponsors

As a nonprofit, Creative Commons relies on the generosity of the public to make events, like the CC Global Summit, possible. Every dollar helps us continue to unlock and expand the limits of open, driving innovation, collaboration, and creativity. Please join us in pushing the boundaries of open by making a gift to CC today!

The post A Look Back at the 2020 Virtual CC Global Summit appeared first on Creative Commons.

Creative Commons 2019 Annual Report

Thursday, November 5, 2020 at 14:24

I am very pleased to share Creative Commons’ 2019 Annual Report.

This report offers an overview of the important work CC did last year across the many domains and subject areas we work in. (Look for CC’s 2020 annual report to be released in early 2021, where we will have lots to share about this year’s accomplishments.)

In 2019, we continued working with major museums to release large collections into the public domain; helped draft the UNESCO OER Recommendation, which facilitates international cooperation around the development and use of freely accessible educational materials; and produced our biggest-ever CC Global Summit community event, which attracted people from all over the world to meet and discuss open access and our digital future. Plus so much more …

Our new and improved report format is designed to better highlight the organization’s accomplishments and impact. We hope you find it enlightening and enjoyable to read.

As the new CEO of Creative Commons, it is very exciting for me to think about all the ways the CC team will build upon this work going forward. This is especially true as we prepare to celebrate the 20th anniversary of Creative Commons in 2021—a monumental anniversary that we are thrilled to have you join us for.

A very sincere thanks to all of Creative Commons’ supporters, community members, friends, and collaborators. We couldn’t do this work without you.

Help us continue to unlock knowledge and creativity for everyone, everywhere—please consider becoming a donor to Creative Commons.


The Linked Commons 2.0: What’s New?

Wednesday, November 4, 2020 at 18:31

This is part of a series of posts introducing the projects built by open source contributors mentored by Creative Commons during Google Summer of Code (GSoC) 2020 and Outreachy. Subham Sahu was one of those contributors and we are grateful for his work on this project.

The CC Catalog data visualization, the Linked Commons 2.0, is a web application that uses graphs to showcase the relationships between the millions of data points of CC-licensed content. In this post, I’ll discuss the motivation for this visualization and explore the latest features of the newest edition of the Linked Commons.

Motivation

The number of websites using CC-licensed content is enormous, and snowballing. The CC Catalog collects and stores these millions of data points, and each node (a unit in a data structure) contains information about the URL of the websites and the licenses used. It’s possible to do rigorous data analysis in order to understand fully how these are interconnected and to identify trends, but this would be exclusive to those with a technical background. However, by visualizing the data, it becomes easier to identify broad patterns and trends.

For example, by identifying other websites that link to your content, you can run a targeted outreach program or collaborate with them. In this way, out of the billions of webpages on the web, you can focus efficiently on the ones where you are most likely to see growth.

Latest Features

Let’s look at some of the new features in the Linked Commons 2.0.

Filtering based on the node name

The Linked Commons 2.0 allows users to search for their favorite node and then explore all of that node’s neighbors among the thousands present in the database. The links connecting the neighbors to the root node, as well as the neighbors themselves, are color-coded according to how they are connected to the root node. This makes it easy for users to classify the neighbors into two categories.
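
As a rough illustration of filtering by node name, the following Python sketch looks up one node in a small directed graph and splits its neighbors into two groups. The edge list, the domain names, and the split into incoming and outgoing neighbors are assumptions made for this example; the post does not say how the two categories are defined or how the server stores the graph.

```python
from collections import defaultdict

# Hypothetical sample of directed links: (source domain, target domain).
LINKS = [
    ("blog.example", "wikipedia.org"),
    ("wikipedia.org", "archive.example"),
    ("photos.example", "wikipedia.org"),
    ("wikipedia.org", "blog.example"),
]


def build_index(links):
    """Index outgoing and incoming neighbors for every node."""
    outgoing = defaultdict(set)
    incoming = defaultdict(set)
    for source, target in links:
        outgoing[source].add(target)
        incoming[target].add(source)
    return outgoing, incoming


def neighbors_of(node, links):
    """Return the neighbors of `node`, split by link direction."""
    outgoing, incoming = build_index(links)
    return {
        "links_to": sorted(outgoing.get(node, ())),
        "linked_from": sorted(incoming.get(node, ())),
    }


result = neighbors_of("wikipedia.org", LINKS)
print(result)
```

A node that links in both directions, like the hypothetical "blog.example" here, lands in both groups, so a real interface would need to decide how to color it.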

A sleek and revamped design

The Linked Commons 2.0 has a sleek design, with a clean and refreshing look along with both a light and dark theme.

The Linked Commons new design

Tools for smooth interaction with the canvas

The Linked Commons 2.0 ships with a few tools that allow the user to zoom in, zoom out, and reset zoom with just one tap. It is especially useful to users who are on touch devices or using a trackpad.

The Linked Commons toolbox

Autocomplete feature

The current database of the Linked Commons 2.0 contains around 240 thousand nodes and 4.14 million links. Unfortunately, some of the node names are uncommon and lengthy. To spare users the tedious work of typing complete node names, this version ships with an autocomplete feature: with every keystroke, node names appear that match what the user might be looking for.

The Linked Commons autocomplete
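
One simple way to picture the autocomplete is a prefix search over a sorted list of node names. The sketch below is only an assumption about how such a feature could work, not a description of the Linked Commons server; the node names and the `autocomplete` helper are hypothetical.

```python
from bisect import bisect_left

# Hypothetical node names; the real database holds around 240 thousand.
NODE_NAMES = sorted([
    "wikimedia.org",
    "wikipedia.org",
    "wiktionary.org",
    "flickr.com",
    "europeana.eu",
])


def autocomplete(prefix, names, limit=5):
    """Return up to `limit` names that start with `prefix`.

    `names` must be sorted and lowercase, so all matches are adjacent.
    """
    prefix = prefix.lower()
    start = bisect_left(names, prefix)  # first name >= prefix
    matches = []
    for name in names[start:start + limit]:
        if not name.startswith(prefix):
            break
        matches.append(name)
    return matches


suggestions = autocomplete("wik", NODE_NAMES)
print(suggestions)
```

Because the list is sorted, each keystroke costs one binary search plus a scan of at most `limit` names, which stays cheap even for a few hundred thousand nodes.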

What’s next for the Linked Commons?

In the current version, some nodes are very densely connected. For example, the node “Wikipedia” has around 89k nodes and 102k links as neighbours. This is too many for web browsers to render, so we need to find a way to reduce it to a more reasonable number.

During preprocessing, we removed more than 3 million nodes which didn’t have CC license information. In general, the current version shows only those nodes which are well linked with other domains and whose license information is available. However, to provide a more complete picture of the CC Catalog, the Linked Commons needs additional filtering methods and other tools. These potentially include:

filtering based on Top-Level domain
filtering based on the number of web links associated with a node
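
To make the two proposed filters concrete, here is a minimal Python sketch. These filters are not implemented yet, so everything below is hypothetical: the node names, the link counts, and the `filter_nodes` helper are invented for illustration.

```python
# Hypothetical nodes mapped to the number of links attached to each one.
LINK_COUNTS = {
    "wikipedia.org": 102000,
    "europeana.eu": 5400,
    "flickr.com": 87000,
    "museum.example.org": 12,
}


def top_level_domain(node_name):
    """Return the text after the last dot, e.g. 'org' for 'wikipedia.org'."""
    return node_name.rsplit(".", 1)[-1].lower()


def filter_nodes(link_counts, tld=None, min_links=0, max_links=None):
    """Keep nodes matching the given top-level domain and link-count range."""
    selected = []
    for name, count in link_counts.items():
        if tld is not None and top_level_domain(name) != tld:
            continue
        if count < min_links:
            continue
        if max_links is not None and count > max_links:
            continue
        selected.append(name)
    return sorted(selected)


org_nodes = filter_nodes(LINK_COUNTS, tld="org")
small_nodes = filter_nodes(LINK_COUNTS, max_links=10000)
print(org_nodes, small_nodes)
```

Taking the last label as the top-level domain is a simplification: it does not treat multi-part suffixes such as "co.uk" as a unit.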

Contributing

We plan to continue working on the Linked Commons. You can follow the project development by visiting our GitHub repo. We encourage you to contribute to the Linked Commons, by reporting bugs, suggesting features or by helping us write code. The new Linked Commons makes it easy for anyone to set up the development environment.

The project consists of a dedicated server which powers the filtering by node name and query autocompletion. The frontend is built using ReactJS, for smooth rendering performance. So, it doesn’t matter whether you’re a frontend developer, a backend developer, or a designer: there is some part of the Linked Commons that you can work on and improve. We look forward to seeing you on board with sparkling ideas!

We are extremely proud of and grateful for the work done by Subham Sahu throughout his 2020 Google Summer of Code internship. We look forward to his continued contributions to the Linked Commons as a project core committer in the CC Open Source Community!

Please consider supporting Creative Commons’ open source work on GitHub Sponsors.


Important Updates to the Creative Commons Catalog

Monday, November 2, 2020 at 17:22

This is part of a series of posts introducing the projects built by open source contributors mentored by Creative Commons during Google Summer of Code (GSoC) 2020 and Outreachy. K. S. Srinidhi Krishna and Charini Nanayakkara were two of those contributors and we are grateful for their work on this project.

The Creative Commons (CC) Catalog collects and stores metadata about more than 500 million CC-licensed images scattered across the internet so that they can be made accessible to the general public via the CC Search and CC Catalog API tools. Creative Commons uses two approaches to collect metadata about CC-licensed images:

Gather metadata directly from the HTML around the image on the webpage where it appears. We source this HTML from an open-source web-crawling project called CommonCrawl.
Pull metadata about an image from a service provided by some image hosts called an API (Application Programming Interface). An API is an interface whereby a computer program can access some hosted data (or metadata) in a consistent and organized manner suitable for developing a pipeline to further process that data.

During our internship, we worked on using the second method described above to expand the number of images indexed in CC Catalog, as well as improve related infrastructure. We refer to a computer program that implements the second method for a given provider as a Provider API Script.


Before diving into the specifics of our contributions, it’s worth clarifying precisely what is meant by a Provider API script. The providers are essentially the cultural or other organisations (such as museums and image hosting services) which host CC-licensed images. Some of these organisations provide APIs to allow external entities to access data such as image metadata hosted by them. The Provider API scripts which can be found in the CC Catalog repository are programs which make use of these APIs to automatically retrieve and aggregate metadata about CC-licensed images hosted by different providers.
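
To give a feel for what a Provider API script does, here is a minimal Python sketch. It is not taken from the CC Catalog repository: the record fields, the `collect_image_metadata` function, and the paging behaviour are all assumptions, and the provider API is replaced by a small in-memory stand-in so the example runs without network access.

```python
def collect_image_metadata(fetch_page, page_size=2):
    """Page through a provider API and collect CC-licensed image records.

    `fetch_page(page, page_size)` is a hypothetical callable that returns a
    list of raw records; in a real script it would wrap an HTTP request.
    """
    images = []
    page = 1
    while True:
        batch = fetch_page(page, page_size)
        if not batch:
            break  # an empty page means the collection is exhausted
        for record in batch:
            license_url = record.get("license_url", "")
            if "creativecommons.org" not in license_url:
                continue  # skip records without a CC license or CC0 mark
            images.append({
                "foreign_id": record["id"],
                "image_url": record["image_url"],
                "license_url": license_url,
                "title": record.get("title"),
            })
        page += 1
    return images


# Stand-in for a provider API (hypothetical data).
FAKE_RECORDS = [
    {"id": "a1", "image_url": "https://img.example/a1.jpg",
     "license_url": "https://creativecommons.org/licenses/by/4.0/",
     "title": "Steam engine"},
    {"id": "a2", "image_url": "https://img.example/a2.jpg",
     "license_url": ""},
    {"id": "a3", "image_url": "https://img.example/a3.jpg",
     "license_url": "https://creativecommons.org/publicdomain/zero/1.0/"},
]


def fake_fetch_page(page, page_size):
    """Return one slice of FAKE_RECORDS, mimicking a paged API response."""
    start = (page - 1) * page_size
    return FAKE_RECORDS[start:start + page_size]


collected = collect_image_metadata(fake_fetch_page)
print([image["foreign_id"] for image in collected])
```

Passing the fetch function in as a parameter keeps the aggregation logic separate from the provider-specific request code, which is one way to reuse the same loop across providers.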

Integration of new provider API scripts

Newly implemented Provider API scripts are:

Science Museum: The Science Museum collection has around 60,000 CC-licensed images from the Science Museum, Science and Industry Museum, National Science and Media Museum, National Railway Museum and Locomotion.
Statens Museum: Statens Museum for Kunst is Denmark’s leading museum for artwork. This is a new integration and 39,115 images have been collected.
Museums Victoria: Museums Victoria in Australia features collections of zoology, geology, palaeontology, history, indigenous cultures and technology. It has around 140,000 images.
NYPL: New York Public Library is a new integration. As of now, it has 1,296 images.

Grouping images by their source

While the organisations that host CC-licensed images are referred to as providers, there is the possibility that some of these providers obtain those images from third party organisations. We refer to such third party organisations as the image source.

For example, NASA is a source organisation that publishes their images through the provider Flickr (which is an American image and video hosting service) as well as the provider nasa.gov. Previously, when accessing CC-licensed images through the CC Search tool, users were only able to categorize those images based on the provider but not the source. However, for images made available by certain significant providers such as Flickr and the Smithsonian, the images are now grouped by their source organisations rather than the provider.

After this implementation, all images from a single source—which were previously scattered across different providers—are accessible from one place. The following screenshot shows how the images from the Smithsonian Institution (a provider) are now categorized under multiple source organizations (museums and research centers within the Smithsonian) in the CC Search tool.

CC Catalog
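
The grouping itself can be pictured with a few lines of Python. This is a sketch under assumed data: the record shape, the field names "provider" and "source", and the sample values are hypothetical and do not come from the CC Catalog schema.

```python
from collections import defaultdict

# Hypothetical records; "source" is the organisation the image came from.
IMAGES = [
    {"id": 1, "provider": "flickr", "source": "nasa"},
    {"id": 2, "provider": "nasa", "source": "nasa"},
    {"id": 3, "provider": "flickr", "source": "flickr"},
    {"id": 4, "provider": "smithsonian", "source": "smithsonian_air_and_space"},
]


def group_by_source(images):
    """Group image ids by source, falling back to the provider if unset."""
    groups = defaultdict(list)
    for image in images:
        key = image.get("source") or image["provider"]
        groups[key].append(image["id"])
    return dict(groups)


by_source = group_by_source(IMAGES)
print(by_source)
```

In this toy data, the two NASA images end up together even though they reached the catalog through different providers.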

Scheduled reingestion of Europeana images

Europeana is home to over 58 million artworks, artefacts, books, films and music from European museums, galleries, libraries, and archives. Data is collected from Europeana daily, but we only pull data about artefacts added within the previous day. This necessitates refreshing the data on a recurring basis.

The idea is that new data should be refreshed more frequently and as the data gets old, refreshing should become less frequent. While developing the strategy, the API key limit (the maximum number of requests that can be made within a given time period) and the maximum collection expected must be kept in mind. Considering these factors, we implemented a strategy that ensures the data in our database is at most six months old while refreshing the most recently uploaded metadata more frequently.
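
The post does not give the actual schedule, so the following Python sketch only illustrates the idea with made-up tiers: data is refreshed daily in its first week, weekly in its first month, monthly until it is 90 days old, and every 90 days after that. The tier values and function names are assumptions.

```python
from datetime import date, timedelta

# Hypothetical tiers: (maximum age in days, refresh interval in days).
REFRESH_TIERS = [(7, 1), (30, 7), (90, 30)]
OLDEST_INTERVAL_DAYS = 90  # used for anything older than the last tier


def refresh_interval(age_days):
    """Return how often (in days) data of the given age is refreshed."""
    for max_age, interval in REFRESH_TIERS:
        if age_days <= max_age:
            return interval
    return OLDEST_INTERVAL_DAYS


def dates_to_reingest(today, lookback_days=365):
    """List the upload dates whose data should be pulled again today."""
    due = []
    for age in range(1, lookback_days + 1):
        if age % refresh_interval(age) == 0:
            due.append(today - timedelta(days=age))
    return due


due_dates = dates_to_reingest(date(2020, 11, 2))
print(len(due_dates), due_dates[:3])
```

With these tiers, the longest gap between two refreshes of the same upload date is 90 days, which stays within a six-month bound, and each daily run only revisits a handful of past dates. Real tiers would have to be tuned against the API key limit and the expected collection size.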

Retaining metadata from an old data source

Creative Commons sometimes generates metadata on the images collected from different third-party sources. When a provider is shifted from one data source to another, the metadata generated for an image is not available in the new data source. Therefore, there is a need to associate the new data with tags corresponding to that image from the old data source. A direct URL match is not possible as the data sources have different image URLs for the same image, so our goal is to match it on the number or identifier that is associated with the URL.
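
A minimal Python sketch of this kind of matching is shown below. The URL shapes, the regular expression, and the record fields are assumptions chosen for illustration; real providers may embed their identifiers differently.

```python
import re

# Assumed URL shape: the identifier is a number at the end of the URL,
# optionally followed by a file extension or a trailing slash.
ID_PATTERN = re.compile(r"(\d+)(?:\.\w+)?/?$")


def extract_identifier(url):
    """Return the trailing numeric identifier in a URL, or None."""
    match = ID_PATTERN.search(url)
    return match.group(1) if match else None


def carry_over_tags(old_records, new_records):
    """Copy tags from old records onto new records with the same identifier."""
    tags_by_id = {}
    for record in old_records:
        identifier = extract_identifier(record["url"])
        if identifier is not None:
            tags_by_id[identifier] = record["tags"]
    merged = []
    for record in new_records:
        identifier = extract_identifier(record["url"])
        merged.append({**record, "tags": tags_by_id.get(identifier, [])})
    return merged


# Hypothetical records from the old and new data sources.
old = [{"url": "https://old.example/photos/12345.jpg", "tags": ["locomotive"]}]
new = [
    {"url": "https://api.new.example/objects/12345"},
    {"url": "https://api.new.example/objects/67890"},
]
merged = carry_over_tags(old, new)
print(merged)
```

New records without a counterpart in the old data source simply receive an empty tag list.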

Expiration of outdated images

As explained in a previous section, we update the images obtained from different providers at varying frequencies depending on the size of each provider’s image collection and the age of each image (i.e. newer images are updated more frequently than older images). Even though this update strategy helps to reflect changes to image metadata (such as the number of views per image), image deletions were not reflected in the database. As a result, obsolete images accumulated in the database, and the CC Search and CC Catalog API tools presented broken links to images that no longer existed.

However, since we know the update frequency for each provider and its images, it is possible to determine whether an image is obsolete by looking at the date it was last updated. Therefore, to resolve this issue, we implemented an image expiration strategy based on the date each image was last updated.
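
The check can be pictured as a comparison between an image’s last update date and the longest time its provider should go without a refresh. The Python sketch below is an illustration only: the per-provider windows, the record fields, and the function names are hypothetical.

```python
from datetime import date, timedelta

# Hypothetical refresh windows per provider, in days.
MAX_DAYS_BETWEEN_UPDATES = {"flickr": 180, "europeana": 180, "nypl": 30}
DEFAULT_WINDOW_DAYS = 180


def is_expired(image, today):
    """An image is expired if it was not updated within its provider's window."""
    window = MAX_DAYS_BETWEEN_UPDATES.get(image["provider"], DEFAULT_WINDOW_DAYS)
    return today - image["last_updated"] > timedelta(days=window)


def split_expired(images, today):
    """Separate images into (still valid, expired) lists."""
    valid, expired = [], []
    for image in images:
        (expired if is_expired(image, today) else valid).append(image)
    return valid, expired


run_date = date(2020, 11, 2)
sample_images = [
    {"id": "a", "provider": "nypl", "last_updated": date(2020, 10, 20)},
    {"id": "b", "provider": "nypl", "last_updated": date(2020, 8, 1)},
    {"id": "c", "provider": "flickr", "last_updated": date(2020, 8, 1)},
]
valid, expired = split_expired(sample_images, run_date)
print([i["id"] for i in valid], [i["id"] for i in expired])
```

The reasoning is that an image which still exists at the provider would have been touched by a scheduled update within the window, so an older timestamp suggests the image is gone.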

Identification of creator types

CC Search helps users filter CC-licensed images in numerous ways, including filtering results by the “creator.” However, providers often identify “creators” by different names, such as “author,” “designer,” “cartoonist,” and “modeler.” This ambiguity is especially pronounced for the Smithsonian (a provider). Thus, we carefully analysed the image metadata from the Smithsonian and created a collection of the various tags that identify creators. With this implementation, the number of Smithsonian images with a missing creator value was considerably reduced. We hope this can be done for other providers as well in the future.
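
In code, such a collection of tags can act as a lookup set. The Python sketch below assumes that an image’s metadata arrives as (label, name) pairs, which is a guess about the data shape; only the first four labels come from this post, and the remaining ones are hypothetical additions.

```python
# Labels treated as creator types; the first four are named in the post,
# the last two are hypothetical additions for illustration.
CREATOR_TYPES = {
    "author", "designer", "cartoonist", "modeler",
    "artist", "photographer",
}


def find_creator(metadata_entries):
    """Return the first name whose label is a known creator type.

    `metadata_entries` is a list of (label, name) pairs, an assumed shape.
    """
    for label, name in metadata_entries:
        if label.strip().lower() in CREATOR_TYPES and name:
            return name
    return None


entries = [("Collector", "A. Smith"), ("Cartoonist", "B. Jones")]
creator = find_creator(entries)
print(creator)
```

Extending coverage to another provider would then mostly mean adding that provider’s labels to the set.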

We are extremely proud of and grateful for the work done by K. S. Srinidhi Krishna and Charini Nanayakkara throughout their 2020 Google Summer of Code internship. We look forward to their continued contributions to the CC Catalog as project core committers in the CC Open Source Community!

Please consider supporting Creative Commons’ open source work on GitHub Sponsors.


Say Hello to Our New CC Open Source Website!

Monday, November 2, 2020 at 16:45

This is part of a series of posts introducing the projects built by open source contributors mentored by Creative Commons during Google Summer of Code (GSoC) 2020 and Outreachy. This post was written by Dhruvi Butti, a 2020 Outreachy intern and a 3rd-year undergrad at IIIT Surat.

“Celebrate endings—for they precede new beginnings.” – Jonathan Lockwood Huie

Around the last week of August, we bid goodbye to our old CC Open Source website and welcomed a new version that is classy, contemporary, and consistent. The months-long process involved redesigning and reimplementing styles for the website, including integrating Vocabulary—a cohesive design library that makes it easier to develop Creative Commons (CC) applications while ensuring a consistently familiar experience.

CC Open Source Website screenshot

Design library! Ah, why?

We aim to have consistency in design and development to create a cohesive brand identity for CC, which will both improve user experience and drive more awareness of CC’s work to donors. In order to do that, we need a common set of design styles and elements to use across all of the websites that make up CC’s online presence (e.g. CC Open Source, CC Global Summit, CC Certificate, etc.). Therefore, we created Vocabulary, which delivers:

Efficiency: Instead of repeatedly building similar components from scratch, design systems (like Vocabulary) enable designers and developers to reuse components and thereby increase efficiency.
Consistency: Design systems introduce a shared set of principles and rules to build web components, making it easier to create consistent experiences across different platforms.
Scale: The increased efficiency and consistency from these systems allow an organization (like Creative Commons) to build products faster and at scale.

Vocabulary at your service

Credit for the new user interface of the CC Open Source website goes to Vocabulary. Our design library is an incredibly useful asset that we have been building for quite some time. All of the design elements are imported from the design library. It is flexible, scalable, and most importantly, well structured and documented so anyone can easily comprehend and make good use out of the available components.

Key differences

Using Vocabulary has significantly improved the user experience. For example, a primary component of the website is guidelines: we have guidelines for contributing, guidelines for how to join a community, guidelines for how to write a blog post, and many more. The new website has cleaner, more readable guidelines with a proper hierarchy. In addition, every piece of information is accessible through secondary navigation.

Below are screenshots of two of the guidelines pages from the new website.

Below are screenshots of two guidelines pages from the old website. The differences are evident.

Help us perfect it.

We believe in the saying ‘The more the merrier.’ You can contribute to the new website by fixing code issues, as well as finding and reporting bugs. All development of the CC Open Source website is documented in this GitHub repository. Please remember to read the contributing section in the repo’s README. Issues marked with the green “help wanted” tag are open to contributors. We look forward to working with you and hearing your feedback!

We are extremely proud of and grateful for the work done by Dhruvi Butti throughout her 2020 Outreachy internship. We look forward to her continued contributions to CC Open Source as a project core committer in the CC Open Source Community!

Please consider supporting Creative Commons’ open source work on GitHub Sponsors.
