Multimedia:Initial survey

The main goal of the survey was to get a better understanding of why the respondents used Wikimedia Commons (or why they did not), what activities they engaged in and why; it was part of our preliminary user research. An underlying goal was to see how users reflected upon their own use of Wikimedia Commons, and to compare this subjective self-reflection to objective data. Last, we wanted to identify behavioral patterns based on correlations between goals, activities and levels of activity.

The survey ran for three days and was linked from all Wikimedia websites for all logged-in users. It was available in 20 languages in order to address the multilingual nature of Wikimedia websites, and particularly of Commons. During this time, 25,150 complete responses to the questionnaire were recorded. We used Pearson's χ² tests to evaluate the independence or correlation of some factors, using a significance threshold of 0.01%.
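To make the test mentioned above concrete, here is a minimal sketch of Pearson's χ² test of independence on a 2×2 contingency table, written with the Python standard library only. The counts are invented for illustration and are not survey data; the function names are hypothetical, and reading the "0.01%" threshold as a significance level of 0.0001 is an assumption.

```python
import math

# Assumption: the 0.01% threshold is read as a significance level alpha = 0.0001.
ALPHA = 0.0001


def chi_square_statistic(table):
    """Return (statistic, degrees of freedom) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row factor and the column factor were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


def p_value_one_dof(statistic):
    """Upper-tail probability of a chi-squared variable with exactly 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2))


# Hypothetical counts (NOT from the survey):
# rows = main motivation is "illustrate Wikipedia" / any other motivation
# cols = uploaded 10 to 100 files / any other upload level
example = [[300, 200],
           [150, 350]]

stat, dof = chi_square_statistic(example)
p = p_value_one_dof(stat)  # valid here because a 2x2 table has 1 degree of freedom
print(f"chi2 = {stat:.2f}, dof = {dof}, independence rejected: {p < ALPHA}")
```

The closed-form p-value only holds for one degree of freedom; larger tables need the general χ² distribution, for instance from a statistics library.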

Schedule

The goal was to have some data before the Meeting in Paris (6-8 November 2009). The fundraising team needed the sitenotice starting November 2nd. We had a very short window of 1 week (26 October - 1 November), which meant translating the survey in 2-3 days.

Results

Larger versions of the charts are available at the end of this document.

Users

 
Reasons for looking up files in Commons (or not).

First, we asked the users if they used Wikimedia Commons, and for what purpose; if they used it infrequently or not at all, we asked why not (see the tables below). Almost 2/3 (62%) of the respondents declared they used Commons (users), while the others declared they did not (non-users). More than half the users identified their main goal when searching for files on Commons as illustrating an article on Wikipedia or another Wikimedia project[Note 1]; the second most common goal was to use media files offline. We also investigated the reasons why some respondents did not use Commons; given that the target audience of the survey was logged-in Wikimedia users, we expected them to be aware of Commons, even if they did not participate in it. On the contrary, it appeared that 70% of the respondents who did not use Commons simply did not know about it until the survey. Other reasons included issues with the search feature or the predominance of English as the lingua franca.

Distribution of the major motivations chosen by respondents when asked why they used Wikimedia Commons.

Reason to use Commons        Respondents  (%)
to illustrate Wikip/media    8611         55%
to use offline               4331         28%
to use online                1426         9%
other                        1369         9%

Distribution of the major reasons chosen by respondents when asked why they did not use Wikimedia Commons.

Reason not to use Commons    Respondents  (%)
don't know Commons           6578         70%
other                        1235         13%
search issues                893          9%
English issues               707          8%
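As a quick consistency check on the two tables above, the following sketch recomputes each percentage from the raw counts and confirms that users and non-users add up to the 25,150 complete responses. The variable and function names are illustrative only.

```python
# Counts copied from the two tables above.
users = {
    "to illustrate Wikip/media": 8611,
    "to use offline": 4331,
    "to use online": 1426,
    "other": 1369,
}
non_users = {
    "don't know Commons": 6578,
    "other": 1235,
    "search issues": 893,
    "English issues": 707,
}


def shares(counts):
    """Return each answer's share of its group, rounded to a whole percent."""
    total = sum(counts.values())
    return {reason: round(100 * n / total) for reason, n in counts.items()}


total_responses = sum(users.values()) + sum(non_users.values())
print("complete responses:", total_responses)
for label, group in (("users", users), ("non-users", non_users)):
    for reason, pct in shares(group).items():
        print(f"{label:10} {reason:28} {pct}%")
```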

Motivations to participate

 
Main goals of participants.

The users were then asked if they also participated in the activities of Wikimedia Commons. More than half the users (60% of users, 37% of respondents) declared they did. Baytiyeh & Pfaffman recently studied the motivations of "administrators" volunteering their time on Wikipedia[1]. They found that the main reason was a desire to learn and "an altruistic desire to create a resource for others to use". Our survey was broader and was not intended to be as thorough as Baytiyeh & Pfaffman's; however, the fundamental difference between Wikipedia and Wikimedia Commons piqued our curiosity: were these motivations shared universally amongst Wikimedia projects? We asked the participants to identify the main reason why they participated in Wikimedia Commons (see the table below). 65% of them declared their main motivation was to illustrate Wikipedia or another Wikimedia project. The altruistic desire to share with the world was consciously identified as the main reason to contribute by only 11% of them; this proportion reaches 28% when consolidated with similar, but less explicit answers. Overall, contributing to Wikimedia Commons is widely considered by participants as a means to illustrate Wikipedia (or other Wikimedia projects), rather than a goal in itself.

Distribution of the major motivation chosen by the respondents when asked why they participated in Wikimedia Commons.

Reason to participate in Commons  Respondents  (%)
to illustrate Wikip/media         6045         65%
to share with the world           1001         11%
to keep Commons running           979          10%
to collect free works             689          7%
for fun                           408          4%
other                             230          2%
 
Main reasons for not participating.

Users who stated they participated only infrequently or not at all were asked why not (see the table below). Our hypothesis was that the upload process (interface and policies) was too complicated, especially for users who did not speak English. In reality, 2/3 of the non-participants gave a reason related to a lack of awareness or appeal of Wikimedia Commons (they had other priorities, they were not interested or they did not even know they could participate). Only 12% of non-participants mentioned the perceived technical or legal complexity of the upload process as the major reason for not participating.

It must however be emphasized that these reasons are intertwined: some respondents explained that participating in Commons was not a priority because the complexity of the upload process required a lot of time. These results show the need for a large-scale communication and outreach campaign to raise awareness and understanding of Wikimedia Commons. In order to facilitate the lasting recruitment of new participants, we suggest holding this campaign only after significant improvements have been made to the user experience on Commons.

Distribution of the major reason chosen by the respondents when asked why they did not participate in Wikimedia Commons.

Reason not to participate in Commons  Respondents  (%)
have other priorities                 1886         28%
did not know they could               1327         19%
are not interested                    933          14%
upload is complex                     785          11%
other                                 583          9%
confusing classification              478          7%
English issues                        393          6%

Motivations & levels of participation

In order to make behavioral patterns emerge, we looked for correlations between motivations and levels of participation. A strong correlation can be established between respondents who uploaded between 10 and 100 files, and respondents whose main motivation is to illustrate Wikipedia. On the other hand, these participants care significantly less about the mission (to collect free works) or the functioning (to keep it running) of Wikimedia Commons, and rarely associate it with "having fun". A similar correlation shows for participants who uploaded between 100 and 1000 files, whose main goal is to illustrate Wikipedia and who are significantly less interested in keeping Commons running. Our proposed explanation is that participants in these two groups are mainly "Wikipedians", uploading media files to Commons on an as-needed basis, while writing or improving articles on Wikipedia.

Another correlation exists between respondents who uploaded more than 10,000 files and those who identify "fun" as their main motivation. On the other hand, these highly active "Commoners" care significantly less about illustrating Wikipedia or other Wikimedia websites.

Activities & levels of participation

General analysis

 
Distribution of respondents by amount of files uploaded to Commons.

We asked the participants how many files they had uploaded to Wikimedia Commons, how many edits they had made there and the frequency at which they performed certain tasks. We were surprised to find that half the participants declared they had uploaded fewer than 10 files to Commons: this means that many Wikimedia participants upload very few media files to Commons, or even none at all. Another third, the occasional participants, uploaded between 10 and 100 files (see the table below).

Distribution of respondents by number of media files uploaded to Wikimedia Commons.

Files uploaded    Respondents  (%)
less than 10      4698         50%
10 to 100         3048         33%
100 to 1000       1238         13%
1000 to 10000     273          3%
more than 10000   93           1%

Correlations

There is a strong correlation between the frequency of upload and the number of files uploaded. While it may seem obvious for modest participants, this finding is particularly significant for highly active users, who not only contribute many media files, but do so on a regular basis.

A strong correlation can also be established between the number of uploads and the number of edits. Participants who upload a lot of media files (and thus contribute the most content) are also the most active participants with regard to general maintenance activities.

Last, participants who upload a high number of files are correlated to those who work significantly on categories. Our proposed explanation is that participants who upload many files are experienced users; they are familiar with the policies and guidelines of Commons, particularly the need to classify the content appropriately in order to facilitate its findability by other users.

Language & project of origin

 
Referrer of respondents.

Given the multilingual and central nature of Wikimedia Commons, one of our goals was also to weigh the influence of the language and website of origin. During the survey, we recorded the HTTP referrer of all respondents and consolidated these results by language, using the ISO 639-1 code. Then, we looked for correlations between language and participation, in order to identify possible patterns. We integrated the number of articles in each associated language version of Wikipedia, provided by the Special:Statistics page as of May 3rd, 2010. Likelihood of participating in Commons was determined by studying the correlation between participants from each language, and participants who reported they participate in Commons "regularly", "sometimes" or "not at all". For languages not included in these results, no conclusive correlation could be established with a behavioral pattern.

Three groups stood out (see the table below). Respondents coming from Wikimedia websites in the Arabic, Bulgarian, Catalan, Czech, Danish, Finnish, French, Hebrew, Italian, Dutch, Norwegian, Polish, Swedish and Chinese languages are correlated with regular participants in Wikimedia Commons. Users from projects in the German, Hungarian and Vietnamese languages are in a similar but less pronounced situation. On the other hand, respondents originating from a Wikimedia project in the English, Japanese, Portuguese, Russian and Turkish languages show the opposite correlation: many of them do not participate in Wikimedia Commons. The size of Wikipedia does not seem to be a decisive factor; we suggest that policies and culture on each wiki are, in fact, mostly responsible for these differences.

Likelihood to participate in Wikimedia Commons depending on the language of origin.

ISO 639-1  Language    Articles  Participation
en         English     3280896   not likely
de         German      1062882   likely
fr         French      943683    very likely
pl         Polish      697703    very likely
it         Italian     683962    very likely
ja         Japanese    673245    not likely
nl         Dutch       600749    very likely
pt         Portuguese  567177    not likely
ru         Russian     531270    not likely
sv         Swedish     355065    very likely
zh         Chinese     307064    very likely
no         Norwegian   257380    very likely
ca         Catalan     240267    very likely
fi         Finnish     237528    very likely
cs         Czech       162116    very likely
hu         Hungarian   160724    likely
tr         Turkish     143958    not likely
da         Danish      128272    very likely
ar         Arabic      126284    very likely
vi         Vietnamese  120572    likely
he         Hebrew      103943    very likely
bg         Bulgarian   97211     very likely
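One simple way to look at the remark about Wikipedia size is to group the article counts in the table above by participation label and compare their ranges. The sketch below does only that; it is a descriptive illustration, not the analysis performed for the survey, and the names used are hypothetical.

```python
from statistics import median

# (article count, participation label) per ISO 639-1 code, copied from the table above.
LANGUAGES = {
    "en": (3280896, "not likely"), "de": (1062882, "likely"),
    "fr": (943683, "very likely"), "pl": (697703, "very likely"),
    "it": (683962, "very likely"), "ja": (673245, "not likely"),
    "nl": (600749, "very likely"), "pt": (567177, "not likely"),
    "ru": (531270, "not likely"), "sv": (355065, "very likely"),
    "zh": (307064, "very likely"), "no": (257380, "very likely"),
    "ca": (240267, "very likely"), "fi": (237528, "very likely"),
    "cs": (162116, "very likely"), "hu": (160724, "likely"),
    "tr": (143958, "not likely"), "da": (128272, "very likely"),
    "ar": (126284, "very likely"), "vi": (120572, "likely"),
    "he": (103943, "very likely"), "bg": (97211, "very likely"),
}


def summarize_by_participation(data):
    """Return {label: (min, median, max)} of article counts for each participation label."""
    groups = {}
    for articles, label in data.values():
        groups.setdefault(label, []).append(articles)
    return {label: (min(v), median(v), max(v)) for label, v in groups.items()}


summary = summarize_by_participation(LANGUAGES)
for label, (low, mid, high) in summary.items():
    print(f"{label:12} min={low:>8} median={mid:>10} max={high:>8}")
```

The ranges overlap widely: the "not likely" group spans from Turkish (143,958 articles) to English (3,280,896), while the "very likely" group spans from Bulgarian (97,211) to French (943,683).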

Own works

General analysis

 
Proportion of own works among the respondents' uploads.

From a workflow point of view, it makes a significant difference whether users are uploading works they created themselves ("own works") or works created by other people (either relatives or strangers). If they created the work themselves and they want to share it on a Wikimedia website, they only have to choose a free license compatible with the Wikimedia licensing policy. Uploading someone else's work requires a completely different workflow, which must make it possible to check with third-party copyright holders that they actually gave permission to use their work under the specified free license.

In order to gain a better understanding of these two cases, we asked the users to evaluate the proportion of own works amongst the total number of files they had uploaded. Based on preliminary analysis, we originally defined four slices: less than 10%, from 10% to 50%, from 50% to 90% and more than 90% own works (see the table below). It appeared the answers were generally very similar for users in the two central groups (between 10% and 90%). The two extreme groups, however, are the most significant; 39% of the participants declared they had uploaded more than 90% of own works, thus constituting a group of "Creators" (photographers, illustrators, mapmakers, etc.). 32% of the participants uploaded less than 10% of own works.

Distribution of the proportion of own works according to the respondents.

Own works      Respondents  Respondents (%)
less than 10%  3010         32%
10% to 50%     1315         14%
50% to 90%     1399         15%
more than 90%  3626         39%

Own works & upload frequency

A strong correlation can be established between the ratio of own works and the frequency of uploads. Participants who uploaded less than 10% of own works are also the ones who rarely upload media files at all. Similarly, there is a strong correlation between participants who uploaded between 10 and 90% of own works and those who upload new files on an occasional or a regular basis. Last, participants who uploaded more than 90% of own works are correlated to the participants who upload new files very often.

Own works & motivations

From an outreach and recruitment point of view, it is of particular interest to analyze the motivations of participants depending on their ratio of own works. For example, our initial hypothesis was that participants who uploaded less than 10% of own works would be "Wikipedians" who occasionally looked for acceptable files to illustrate their article and uploaded them to Commons. In reality, our results show there is a strong correlation between participants who uploaded less than 10% of own works and the participants motivated by fun, the collection of free works and the functioning of Commons. On the contrary, these users show a significantly lower motivation for illustrating Wikipedia. Our proposed explanation is that users who uploaded less than 10% of own works are actually "Gleaners", who look for media files of interest, under an acceptable license, and collect them on Wikimedia Commons.

Similarly, we expected participants who uploaded more than 90% of own works ("Creators") to be mainly motivated by fun or the desire to share their work with the world. In reality, while there is indeed a correlation between these users and the desire to share with the world, a correlation can also be established with users whose main motivation is to illustrate Wikipedia or another Wikimedia project. These users are significantly less motivated by fun or the will to keep Commons running. "Creators" are thus not necessarily "Commoners"; a large part of them are actually "Wikipedians".

Limitations

As with any survey, there is a possibility that the respondents did not understand the intended meaning of the questions. We introduced control questions to identify and limit such misunderstandings. Also, some respondents may have misjudged their actual levels of participation. We provided links to automated tools to help them check these metrics during the questionnaire. Last, the study was originally intended to be targeted at all visitors of Wikimedia Commons, as well as all logged-in users from all Wikimedia websites. A technical glitch prevented us from targeting logged-out visitors; it therefore introduced a Wikimedia-specific bias in the analysis of why the respondents used Wikimedia Commons. This means our results focus on the "active" Wikimedian population, and further research will be necessary to better understand the "passive" users of Wikimedia Commons[Note 2].

Conclusions

This study was conducted as part of a research-driven design process aiming to improve the user experience of contributing multimedia content to Wikimedia websites. We showed that the nature of Wikimedia Commons made it unique, different both from popular media sharing platforms and from other Wikimedia websites. The growth of its content has been dramatically faster than that of its community. New tools and/or processes have to be developed in order to help the community absorb the increasing inflow of new files.

We presented the results of an online user survey, which showed that Commons was neither well known nor widely used, even among Wikimedia participants: half the respondents who participated had uploaded fewer than 10 files. Commons is considered by many to be only a means to add a media file to a page on another Wikimedia project, such as an article on Wikipedia. By analyzing the correlations between the answers of the respondents, we identified several types of users, such as "Wikipedians", "Commoners", "Creators" and "Gleaners". Further research is needed to understand the influence of language on a multilingual and multicultural collaborative website like Wikimedia Commons.

As the first study on this topic, and particularly on Wikimedia Commons, our research was exploratory by nature. Our results confirmed some hypotheses we made, but also clearly invalidated others. This demonstrates the need for a more thorough investigation of the goals, needs and motivations of users who contribute multimedia content to Wikimedia websites, or who are at least willing to do so to some extent.

The online survey was only one of the channels we used to collect information about users. We also invited them to contribute to an open forum, we interviewed and observed users and we met with stakeholders and subject matter experts. We present the consolidated findings of this additional user research below, as well as the product development process and the design decisions that ensued.

Notes

  1. See Limitations.
  2. "Active" users are generally called "editors", or "participants", and "passive" users "readers". However, in the case of a media library, "readers" does not really apply.

References

  1. Why be a Wikipedian. H. Baytiyeh and J. Pfaffman. In CSCL'09: Proceedings of the 9th international conference on Computer supported collaborative learning, pages 434-443. International Society of the Learning Sciences, 2009.

Large versions of the charts

 
Reasons for looking up files in Commons (or not).
 
Distribution of respondents by amount of files uploaded to Commons.
 
Proportion of own works among the respondents' uploads.
 
Main goals of participants.
 
Main reasons for not participating.
 
Referrer of respondents.