Last week I attended FLOSS 2008, the second international workshop/network meeting on FLOSS (Free/Libre/Open Source software) in Rennes, France. I was presenting my paper Innovation and Imitation with and without Intellectual Property Rights (and would have offered discussant comments but the author of the paper I was scheduled to discuss had to pull out at the last minute). In addition to this I got to hear a variety of interesting talks. On some of these I was able to take notes which I have included below for the ‘delectation’ of anyone else who is interested.

Mikko Valimaki: IPR and Open Source Software

  • Goodman and Myers (2005) — the 3G standard.
  • Leveque and Meniere 2007: what does RAND mean
    • reasonable royalty is R = c (v1-v2)p where c is the incremental cost of licensing, v1-v2 is gain from using this patent over second-best.
  • Other questions for royalty-setting
    • quality or volume of patents
    • early or late innovators
    • cumulative royalties or one-time fees
  • But all models he knows of have non-zero royalty fees
    • [ed]: not surprising given that you will always get interior solutions
  • Windows/Samba discussion
    • specific sets of terms
    • provide RF for the open source community
  • Commission Decision para 783
    • “On balance, the possible negative impact of an order to supply on Microsoft’s incentives to innovate is outweighed by its positive impact on the level of innovation of the whole industry.”
  • Nokia to acquire Symbian:
    • “a full platform will be available … under a royalty-free license … from the Foundation’s first day of operations … the Foundation will make selected components available as open source at launch.”
    • [ed]: Motivation here is clear: Nokia care about the hardware, and for them software is a complementary good which they therefore wish to be as cheap as possible. But this raises the question of what is being made open: is it hardware patents or pure software patents (and if so how big a deal is this)?

Stefan Koch: Efficiency of FLOSS Production

  • Question of efficiency of open source development
  • How much software did we get for our effort
    • Is OS a waste of resources?
  • Discussion without much empirical basis
    • Claim: fast and cheap, high quality, finding bugs late is inefficient (actually large effort) — see IEEE Software 1999
  • Completely unknown as no-one keeps time-sheets. So
    • Effort based on participation data
    • Effort based on product — look at software and ask how much effort would be needed in commercial environment
  • Empirical research in open source
    • Mainly case studies
    • Helpful but need proper large-scale analysis
  • Mined software repositories [ed: cf. today FLOSSMetrics, FLOSSmole]
    • 8,261 projects
    • 7,734,082 commits
    • 663M LOCs
    • resources and output are skewed: top decile of programmers: 79% of code base, second decile: 11%
  • Effort estimation based on actual participation
    • active programmer months (define active as committing in a given month)
    • high correlation with LOC added in month
  • Cumulate this number for each project
    • But not equal to a commercial person-month
    • How do we scale: use 18.4 h/w taken from stats for committers on Linux kernel
    • [ed:] this is the key assumption. The whole point is that FLOSS effort is not observed and they are using a measure of output (committing) and trying to infer actual activity
  • Manpower function modelling:
    • Norden-Rayleigh model (1960)
    • Some set of problems N (unknown but finite)
    • Probs are solved independently and randomly (following Poisson)
    • This fits ok but has eventual decline in participation which does not occur
    • Modify this: in particular to allow introduction of new problems
      • Introduce in prop to original no. problems, in prop to current set of problems etc
      • Also have different learning rates
      • [ed: but isn’t the setup a little different. Really it is a question of success vs. non-success in terms of acquiring users + some kind of bound on amount of participation due either to fission or complexity]
  • Product-based estimation
    • COCOMO 81 and COCOMO 2
  • Results:
    • Comparison COCOMO - Norden-Rayleigh
    • For COCOMO 81 cannot find parameters favourable enough to explain Norden-Rayleigh curve
    • For COCOMO 2 can find parameters but very favourable
    • Suggest (roughly) that FLOSS very efficient (but not very rigorous)
  • More formal estimation using all models etc
    • Norden-Rayleigh significantly below product-based estimates (factor of 8 in mean)
  • Interpretation
    • FLOSS v. efficient (self-selection for tasks etc)
    • Extremely high amount of non-programmer participation (1:7 relation …)
  • [ed]: not sure about this generous view. Other explanations
    • No quality measurement (also mentioned by Koch)
      • OK: lot of code but low quality
    • (Related) Many sourceforge projects are incomplete, easy bit at the start
      • Later comes a lot of refactoring/writing documentation. This may display significant diminishing returns
    • Many FLOSS projects come from what were originally commercial projects. In that case:
      • code may have already been written
      • conceptual components have been done already
    • Trade-off of time vs. productivity
      • May be more productive to only work 10h a week but then product might not be ready for 10 years
  • From discussion
    • interesting point: Nokia thinking of moving to more FLOSS in-house because they can’t manage their 5-10k programmers centrally any more
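
The participation-based estimate in these notes can be sketched in Python. This is a minimal illustration and not Koch's actual code: the commit records are made up, the 40-hour commercial week is my assumption (the notes only give the 18.4 h/w figure), and the Norden-Rayleigh curve is written in its standard textbook form with hypothetical parameters.

```python
import math
from collections import defaultdict

# Hypothetical commit log: (project, developer, "YYYY-MM"), one entry per commit.
COMMITS = [
    ("proj", "alice", "2007-01"),
    ("proj", "alice", "2007-01"),
    ("proj", "bob", "2007-01"),
    ("proj", "alice", "2007-02"),
    ("proj", "carol", "2007-03"),
]

FLOSS_HOURS_PER_WEEK = 18.4       # figure quoted in the talk (Linux kernel committers)
COMMERCIAL_HOURS_PER_WEEK = 40.0  # assumption: a full-time commercial week


def active_programmer_months(commits):
    """Count distinct (developer, month) pairs per project.

    A developer is 'active' in a month if they committed at least once.
    """
    active = defaultdict(set)
    for project, developer, month in commits:
        active[project].add((developer, month))
    return {project: len(pairs) for project, pairs in active.items()}


def commercial_person_months(active_months):
    """Scale active programmer months to full-time-equivalent person-months."""
    return active_months * FLOSS_HOURS_PER_WEEK / COMMERCIAL_HOURS_PER_WEEK


def norden_rayleigh_manpower(t, total_effort, a):
    """Manpower at time t: m(t) = 2*K*a*t*exp(-a*t^2), with K = total effort."""
    return 2.0 * total_effort * a * t * math.exp(-a * t * t)


def norden_rayleigh_cumulative(t, total_effort, a):
    """Cumulative effort up to time t: K*(1 - exp(-a*t^2))."""
    return total_effort * (1.0 - math.exp(-a * t * t))


months = active_programmer_months(COMMITS)["proj"]
effort = commercial_person_months(months)
print(months, round(effort, 2))
```

In this form the manpower curve peaks at t = 1/sqrt(2a) and then falls towards zero, which is the eventual decline in participation that the notes say is not observed in the data.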

Mickael Vicente: Shift to Competences Model: A Social Network Analysis of Open Source Professional Developers

  • Robles 2007
    • Statistics on Debian showing increasing corporate involvement
  • Social network extraction
    • Get repo logs
    • Create link between 2 developers if they have committed on the same file (non-directed graph)
      • Simplification: the best collaboration of each developer (directed graph) — pick other developer with whom they have committed most files in common
    • Longitudinal analysis
      • extract clusters
  • Correlation with professional career
    • CV collected on Internet, personal web page etc (96% collected)
  • Interesting data
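
The network extraction step described above can be sketched roughly as follows. The log entries and the alphabetical tie-breaking rule are my own assumptions for illustration; the talk did not specify how ties are handled.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical repository log: (developer, file) pairs, one per commit.
LOG = [
    ("alice", "a.c"), ("bob", "a.c"),
    ("alice", "b.c"), ("bob", "b.c"),
    ("alice", "c.c"), ("carol", "c.c"),
]


def shared_file_counts(log):
    """Undirected graph: {frozenset({dev1, dev2}): files both committed to}."""
    devs_by_file = defaultdict(set)
    for developer, path in log:
        devs_by_file[path].add(developer)
    counts = defaultdict(int)
    for developers in devs_by_file.values():
        for pair in combinations(sorted(developers), 2):
            counts[frozenset(pair)] += 1
    return dict(counts)


def best_collaborators(counts):
    """Directed simplification: each developer points to the partner with
    whom they share the most files (ties broken alphabetically)."""
    partners = defaultdict(dict)
    for pair, n in counts.items():
        a, b = sorted(pair)
        partners[a][b] = n
        partners[b][a] = n
    return {
        dev: min(others, key=lambda other: (-others[other], other))
        for dev, others in partners.items()
    }


edges = shared_file_counts(LOG)
best = best_collaborators(edges)
print(best)
```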

Nicholas Radtke: What Makes FLOSS Projects Successful: An Agent-Based Model of FLOSS Projects

  • Positive Characteristics of FLOSS
    • High quality (Low defect count: Chelf 2006)
    • Rapid development
    • Violates Brooks law (Rossi 2004)
  • Risky Business
    • for every successful FLOSS project there are dozens of unsuccessful projects
  • Corporate IT manager survey (2002)
    • 41% mention inability to hold someone responsible for software
  • Attempts at Simulating FLOSS
    • SimCode (Dalle and David 2004)
    • OSSim (Wagstrom et al 2005)
    • K-Means stuff
  • Simulate across landscape
    • Not social network
    • Focus on developer decision to join/contribute to projects (Agent-Based Modelling)
  • Defining Success and Failure
    • Traditional metrics do not work well (on budget?)
    • Completion (Crowston et al. 2003)
    • Progression through maturity stages (Crowston and Scozzi 2002)
    • Number of developers
    • Mailing list activity
    • Project outdegree, Active developer count (Wang 2007)
  • The Model Universe
    • Agents and projects
    • Agents:
      • Consumption: 0-1
      • Producer: 0-1
      • Resource: 0-1.5 (1=40h)
      • Memory: agents only aware of some subset of projects
      • Needs vector (preferences)
      • utility: linear sum of: similarity match + current popularity (current resources) + cumulative resources + download + f(maturity)
    • Projects:
      • resources needed
      • current resources
      • cumulative resources
      • download count
      • preferences: same as agent but converges towards those had by agents working on it
  • Agents choose between projects each time period
    • have some randomness in that use multinomial logit: prob choose project i ~ exp(mu * Utility of project i)
  • Results
    • Simulate over 250 time steps ~ 4 years
    • calibrate [ed: in a way I was not quite clear about]
    • compare simulation with empirical data from sourceforge
      • developers per project
      • projects per developer
    • Find that (from simulation data) downloads and cumulative resources are not important
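
The multinomial logit choice rule mentioned above can be written down in a few lines. This is only a sketch of the choice rule itself, with made-up utilities and a hypothetical value of mu. It is not the authors' simulation.

```python
import math
import random


def choice_probabilities(utilities, mu):
    """Multinomial logit: P(i) = exp(mu*U_i) / sum_j exp(mu*U_j)."""
    top = max(utilities)  # shift by the max so the exponentials stay small (mu >= 0)
    weights = [math.exp(mu * (u - top)) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def choose_project(utilities, mu, rng):
    """Draw one project index according to the logit probabilities."""
    probs = choice_probabilities(utilities, mu)
    return rng.choices(range(len(utilities)), weights=probs, k=1)[0]


utilities = [1.0, 2.0, 3.0]          # hypothetical utilities of three projects
probs = choice_probabilities(utilities, mu=1.0)
uniform = choice_probabilities(utilities, mu=0.0)
picked = choose_project(utilities, 1.0, random.Random(0))
print(probs, picked)
```

With mu = 0 every project is equally likely, and as mu grows the agent picks the highest-utility project almost deterministically, so mu controls how much randomness enters the choice.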

Fabio Manenti: Dual Licensing in Open Source Software Markets

  • Benefits of Going Open Source
    • feedback from community
    • network effects (usage)
    • competitive pressures (e.g. Netscape) [ed: not sure this is a benefit]
  • Dual-licensing
    • Kosky (2007): 6% of representative sample of European OSS business firms employ DL strategies

Alexia Gaudeul: Blogs and the Economics of Reciprocal (In-)Attention

  • What blogs are
  • Reasons for blogging
  • Question: do you befriend (link) because of content produced or do you produce content because of friends
  • General points
    • Market interactions only part of wider class of reciprocal relations
    • Time vs. money economics
    • Unique dataset, very detailed and complete, to test networked relations
  • Model — but left out due to time
  • Dataset: livejournal 2006
    • Sociology: teenagers to young adults (15 to 23), female (67%), Americans (70%)
    • Fast growth: created in 1999, 8M accounts, 1.3M active
    • FLOSS but for-profit (SaaS)
    • Great part from self-referential
    • Lively: 4 comments per post on average
    • Federated by communities: no. of communities per person 15
    • Journals updated for more than 2 years on avg
    • 70% have posted in last 2 months
    • No. of entries: 1 every 2 days
    • No. of friends: 50 avg
    • Balance between friends and friends of
    • Balance between comments received / made
  • Friendship patterns
    • May be balance but does not explain no. of friends of diff. individuals
    • Need to distinguish
      • Norm of reciprocity: more promiscuous bloggers accumulate friends
      • Content attractiveness
        1. Quality/freq. of posts
        2. Interactivity (comments per post)
  • Regressions
    • Reciprocity: No. blogs read (friend) = b * number of readers (friend of) + error
    • Activity: No. readers = cX + error — X = matrix of ind. variables
    • Endogeneity issues [ed: all over the place]
    • Regress: ln(Friends) = ln(Friend of) + … (instrumenting Friends Of on Activity to solve endogeneity issues)
      • Saturation around 400 friends seemingly (few with more)
    • Max no. of friendship when your no. friends = no. friends of (maybe)
      • A norm of reciprocity
    • Issues with endogeneity of activity (which was used to instrument friends of)

Sylvain Dejean

  • Does ICT (the Internet) lead to a global village or a cyber-balkan?
  • What leads to emergence of virtual communities
  • Is the heterogeneity of contributions an impediment to self-organization
  • How to manage virtual communities
  • Agent-based model:
    • Individuals defined by some characteristics
    • Herfindahl index measures degree of self-organization [ed: why self-organization]
    • Communities change via selection and variation
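
For reference, the Herfindahl index is the sum of squared shares. The sketch below assumes the shares are community sizes as a fraction of all individuals; the notes do not record exactly what was being measured, so treat the inputs as hypothetical.

```python
def herfindahl(sizes):
    """Herfindahl index: sum of squared shares, between 1/n and 1."""
    total = sum(sizes)
    if total <= 0:
        raise ValueError("need at least one non-empty community")
    return sum((size / total) ** 2 for size in sizes)


even = herfindahl([10, 10, 10, 10])   # individuals spread evenly over 4 communities
concentrated = herfindahl([40, 0, 0])  # everyone in a single community
print(even, concentrated)
```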

Following up on their commitments in the 2008 Budget (see previous post dealing with publication of the ‘Cambridge Study’), today BERR and HMT announced a review of Trading Funds. It will be run by the Shareholder Executive with input from HMT and OPSI. The main task of this review, according to the announcement, is to:

… examine the impact on the trading funds’ business models of any changes to the current pricing, accessing and licensing regimes with the aim of:

  • distinguishing more clearly what information is required by Government for public policy
  • ensuring that this information is available as widely as possible in order to maximise the benefits to the wider UK economy, at a price that balances the provision of such access with the need for users to make a fair contribution to the cost of collecting the information in the long term.

The policy objectives of each of the trading funds will not form part of the assessment, but the review will consider the future of the trading fund model and how it impacts on the delivery of these objectives.

Last year I collated and distilled the notes and summaries accumulated over the PhD into a proper paper which could act as the literature review in my dissertation. While I submitted the PhD last August I’ve only just got around to posting this up and it can now be found at:

http://www.rufuspollock.org/economics/papers/economics_of_knowledge_review.pdf

From the abstract:

A selective review of the existing theoretical literature related to the economics of knowledge with particular attention to intellectual property, especially in the form of patents.

Note for those seeking the references they can all be found in the economics bibliography found at:

http://www.rufuspollock.org/economics/biblio/

A refactoring of the first theoretical part of my optimal copyright paper has now been published in the December issue of the Review of Economic Research on Copyright Issues (RERCI) under the title: Optimal Copyright over Time: Technological Change and the Stock of Works. A preprint can be found at:

http://www.rufuspollock.org/economics/papers/optimal_copyright_over_time.pdf

While at the EEA/ESEM summer conference, confronted by the multitude of papers and provoked by the comments of Janos Kornai and Assar Lindbeck, I wondered how things had changed over the last half-century. How many more papers (and economists) were there compared to 20/30/50 years ago, and how had the nature and the quality of research changed (at least partially as a result of a change in this quantity)?

The first question, while the less interesting, had the advantage of lending itself to ready quantitative analysis, so it was to this I first applied myself. Armed with my AEA password I quickly hacked up a script to scrape total publications per annum out of econlit (note that this obviously does not include my real password) and posted the results as a dataset on OpenEconomics.net.

Having now finally got round to writing it up here (after several months' delay), the results can be seen below, as well as in the original plot linked from the dataset page (the graph renders via javascript and may have some issues in IE so if you don’t see anything try Firefox …). The key points to note are:

  • Output has more than quintupled between 1970 and today (~5k to ~28k).
  • There is a suggestion of a structural break in the data around 1990 with the rate of growth being higher after that point.

Thus, it would appear there has been a substantial increase in output. For comparison, world GDP has increased by slightly more than 2x 1970-2005 (see e.g. Delong’s estimates) and US GDP increased by slightly under 3x (3.7 trillion to 11 trillion) over the same period.

The next step is to determine what is driving this increase and how much it actually corresponds to an increase in ‘knowledge’. In particular one should compute output per (economist) capita and look at other possible explanations of the rise, for example: technology (computers and communications), different working practices, institutional factors — e.g. greater focus on published research output (the research assessment exercises). The harder question will be the second item: trying to determine how much this increase in raw quantity actually signifies an increase in real understanding and know-how.

Year  Number of Articles
1970  5081
1971  5012
1972  5685
1973  5981
1974  5965
1975  5997
1976  6403
1977  7077
1978  7573
1979  7799
1980  8220
1981  8420
1982  8387
1983  9418
1984  9552
1985  9918
1986  9872
1987  9918
1988  10551
1989  10768
1990  11254
1991  11905
1992  13108
1993  13493
1994  14374
1995  15825
1996  17692
1997  18385
1998  19869
1999  20818
2000  21836
2001  22322
2002  23331
2003  23984
2004  25843
2005  27738
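
The two key points can be checked directly from the figures above. The following is a rough sketch using only the 1970, 1990 and 2005 counts. Comparing average growth rates across two sub-periods is my own simplification and is not a formal structural-break test.

```python
import math

# Article counts taken from the table above.
ARTICLES = {1970: 5081, 1990: 11254, 2005: 27738}


def growth_ratio(start, end):
    """How many times larger output is in `end` than in `start`."""
    return ARTICLES[end] / ARTICLES[start]


def annual_log_growth(start, end):
    """Average continuously-compounded growth rate per year."""
    return math.log(growth_ratio(start, end)) / (end - start)


ratio = growth_ratio(1970, 2005)
before = annual_log_growth(1970, 1990)
after = annual_log_growth(1990, 2005)
print(round(ratio, 2), round(before, 3), round(after, 3))
```

On these numbers output grows by a factor of about 5.5 overall, at roughly 4% a year before 1990 and roughly 6% a year afterwards.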

A new study is out on the relationship of unauthorised downloading and music purchases. The work was carried out by two economists, Birgitte Andersen and Marion Frenz, of Birkbeck College (University of London) for Industry Canada. Entitled The Impact of Music Downloads and P2P File-Sharing on the Purchase of Music: A Study for Industry Canada its description states:

Industry Canada undertook a music file sharing study during 2006-07 to measure the extent to which music downloads over peer-to-peer file sharing networks, for which the sound recording industry receives no remuneration, affect music purchasing activity in Canada. The data used for this analysis are from a Decima Research survey conducted between April and June, 2006, on behalf of Industry Canada. The report, prepared by University of London researchers, Birgitte Andersen and Marion Frenz, found that music downloads have a positive effect on music purchases among Canadian downloaders but that there is no effect taken over the entire population aged 15 and over.

This is a new contribution to the literature examining the relationship of unauthorised downloading and sales which I first reviewed two years ago. The results would clearly support those who argue that the positive sampling effect of unauthorised p2p downloading counterbalances (or even outweighs) the substitution effect (for more on these terms see the review). The effects found are quite substantial, at least when restricted to their P2P downloaders subsample (from the summary of findings)

“… our analysis of the Canadian P2P file-sharing subpopulation suggests that there is a strong positive relationship between P2P file-sharing and CD purchasing. That is, among Canadians actually engaged in it, P2P file-sharing increases CD purchasing. We estimate that the effect of one additional P2P download per month is to increase music purchasing by 0.44 CDs per year” [emphasis added]

However looking through the paper one needs to be a little cautious in taking these results at face value. In particular, the statement in the abstract that “music downloads have a positive effect on music purchases among Canadian downloaders” is a classic case of interpreting a correlation as a causative relationship (this (mis)interpretation is even more baldly stated in the summary of findings — see previous quote above). Given the cross-sectional nature of their data such an interpretation is particularly dubious (as the authors themselves acknowledge in the Data and Methodology section: “… regressions based on cross-sectional data cannot prove causality”).

Furthermore, there is a major problem here with the regression specification: p2p downloads and music purchases may both be driven by an omitted variable, for example interest in music. In that case a simple regression of purchases on downloading activity will be upwardly biased (i.e. the estimated impact of downloads on purchases will be too high) because those interested in music would then both download more and purchase more. To address this problem you’d need some form of ‘identification’ strategy, probably using an instrumental variables approach. (This issue is very similar to that encountered when doing a straight regression of sales on downloads: again the estimated coefficient is going to be upwardly biased because both trend (independently) upwards when an album is released.) This problem could be made even worse by focusing solely on downloaders.

Again the authors are aware of this issue but don’t feel they can do much about it (from the end of the Data and Methodology section):

… single equation estimations assume that all independent variables are exogenous and all important variables are included in the estimation. If, however, any of the independent variables are influenced by the dependent variable and/or any of the independent variables, or important independent variables are omitted, then the included independent variables tend to be correlated with the error term leading to inconsistent estimates

Unfortunately, useful instruments are inherently difficult to find and this is why we decided not to use instrumental variable techniques. … …

While it may be true that there was not much they could do about this issue given the data they had it does mean that one should be cautious in taking the regression results at face value — in particular the main finding, for the downloaders subsample, of a substantial positive effect of (unauthorised) downloads on CD purchases, which may simply be picking up an omitted variable (for example, interest in music).
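
The omitted-variable concern can be illustrated with a small simulation. Everything here is made up for illustration (the variable names, the unit variances, the sample size) and has nothing to do with the Industry Canada data: downloads are given no causal effect on purchases at all, yet a simple regression still finds a positive coefficient because both depend on an unobserved 'interest in music'.

```python
import random


def ols_slope(xs, ys):
    """Slope from a simple regression of ys on xs (with intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


def simulate(n, true_effect, rng):
    """Generate downloads and purchases that share an unobserved driver."""
    downloads, purchases = [], []
    for _ in range(n):
        interest = rng.gauss(0.0, 1.0)            # unobserved interest in music
        d = interest + rng.gauss(0.0, 1.0)        # downloads rise with interest
        p = true_effect * d + interest + rng.gauss(0.0, 1.0)
        downloads.append(d)
        purchases.append(p)
    return downloads, purchases


rng = random.Random(42)
downloads, purchases = simulate(20000, true_effect=0.0, rng=rng)
slope = ols_slope(downloads, purchases)
print(round(slope, 2))
```

With these assumed variances the regression coefficient converges to 0.5 even though the true effect is zero, so a positive estimate on its own says little about causation.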

I’ve just noticed this interesting paper posted on Bessen’s Research on Innovation website at the end of September. Entitled The Political Economy of Patent Policy Reform in the United States, it is authored by F.M. Scherer, one of the elder statesmen of innovation and IP research. As the abstract puts it:

explores a paradox: the extensive tilt toward strengthened patent laws in the United States and the world economy during the 1980s and 1990s, even as economic research was revealing that patents played a relatively unimportant incentive role in most large companies’ research and development investment decisions. It proceeds by tracing the political and evidence-based history of several major initiatives: the Bayh-Dole and Stevenson-Wydler Acts of 1980, the creation of the Court of Appeals for the Federal Circuit in 1982, the Hatch-Waxman Act of 1984, changes in antitrust presumptions, and the inclusion of TRIPS provisions in the new international trade rules emerging in 1993 from the Uruguay Round. An excursion follows into the relatively sudden ascent of the term “intellectual property” as a form of propaganda. Suggestions for further policy reforms are offered.

This is a fascinating read, informed by the fifty years Scherer has been working in this field, and well worth the time taken to read its 52 pages. As just one example consider footnote 15 which reports the reaction of Professor Doriot (of Harvard Business School and the first High Tech VC group the American Research and Development Corporation) to their contemplated research on the importance of patents: “Hell, patents are simply instruments with which big companies bludgeon my startups.” How much, one wonders, has changed in the 50 years since?

The Computer and Communications Industry Association (CCIA), a lobbying group for technology companies, has put out a report entitled Fair Use in the U.S. Economy. The report generates large numbers:

The research indicates that the industries benefiting from fair use and other limitations and exceptions make a large and growing contribution to the U.S. economy. The fair use economy in 2006 accounted for $4.5 trillion in revenues and $2.2 billion [sic: should be trillion] in value added, roughly 16.2 percent of U.S. GDP. It employed more than 17 million people and supported a payroll of $1.2 trillion. It generated $194 billion in exports and rapid productivity growth.

As a result, and thanks no doubt to the PR efforts of the sponsors, the report has been getting plenty of attention, with its conclusion interpreted as showing that the ‘fair-use economy’ is more important than the ‘copyright economy’. For example, InformationWeek quotes CCIA CEO Ed Black claiming that while “the value added to the U.S. economy by copyright industries amounts to $1.3 trillion […] the value added to the U.S. economy by the fair use amounts to $2.2 trillion.” (source plus repetition e.g. on Slashdot. See also the Google policy blog).

Unfortunately, while perhaps interesting as propaganda these figures have zero ‘intellectual’ credibility and, in fact, little basis in the study itself. For all the study actually does is label a whole bunch of industries as ‘fair-use’ related and then sum up their contribution to GDP and Value-Added. Leaving aside their extremely questionable classification of companies as ‘fair-use related’, the basic problem is that the study makes no effort to actually work out whether fair-use was essential to these businesses, or, more specifically, what difference the absence of fair-use would have meant to their profitability or success. Just because a company makes some use of the fair-use exceptions doesn’t mean you can suddenly ascribe its full value to the existence of those exceptions!

Thus there is absolutely no way this study tells us what the ‘contribution of fair-use’ to the economy actually is and certainly no way to make specific statements such as the “value added to the U.S. economy by fair use amounts to $2.2 trillion”. The study’s authors no doubt were aware of this, hence that clever elision in the above quote between industries “benefiting from fair-use” and the “fair-use economy”, with the latter phrase implying a much more direct dependence on the benefits of fair-use than the former.

Of course it is also true that just as much propagandizing (based on equally poor “research”) is done by those on the other side of the debate (see for example my analysis of the BSA’s piracy claims) but I am deeply sceptical that two wrongs make a right. What we need in debates over IP is not more propaganda but more evidence.

Last week I was at the 2007 Society for Economic Research on Copyright Issues (SERCI) Annual Congress (also acronymed under the label SERCIAC). The event was a nice size with a good mix of people, well organized (a big thank-you here to Christian Handke) and with many interesting presentations (some of which I was able to take notes on — see below). I also had the chance to present my paper on optimal copyright and get some useful feedback. I’ve already mentioned this in a previous post but for those interested you can find the slides from my talk here:

http://www.rufuspollock.org/economics/papers/optimal_copyright_talk.pdf

and the full paper here:

http://www.rufuspollock.org/economics/papers/optimal_copyright.pdf

Rough Notes on Some of the Presentations

Professor Richard Lipsey: Technological Transformations, IPRs and Second Best Theory

Intro

  • Real consumption 10x level 100 years ago
  • Tech change the major component
    • Frequently underestimated in growth accounting because capital growth also includes payment for innovations
  • General purpose technologies
    • Revolutionary
    • Lots of unexpected applications
    • Externalities too … [ed: though this is always true with innovation]
  • ~20 examples of GPTs concluding with suggestions for future (Biotech and Nanotech)
  • Why was it in the West that take-off occurred?
  • Suggests due to scientific culture and pluralism
    • Debated among economic historians …
  • Why science better in Christian West rather than China and Islam?
    • Variety of reasons, many ‘accidental’ e.g. pluralism of Roman state, discovery of Aristotle post reconquista in Spain …
    • Can not ignore historical accident in understanding history of economic growth
  • Institutions are crucial
    • Through the universities: “the West took a decisive and probably irreversible step toward the inculcation of a scientific worldview that extolled the power of reason and painted the universe –human, animal, inanimate– as a rationally ordered system” (Huff 1993: 189)
    • Universities are ‘institutional memory for science’ (a plough, because embodied, de facto persists in a way that pure knowledge does not)
    • Differed from Islam because more unified and did not succumb to reaction later [ed: though this begs the question of the direction of causation].
    • China also did not have good institutions to act as this memory
  • Shared (science-based) world-views also important
  • Stuff happens slowly and is hugely cumulative and incremental not ‘out of the blue’

Modelling

  • Two approaches:
    • Neoclassical (Arrow-Debreu DSGE)
    • Evolutionary/Structuralist
  • Neoclassical v. powerful but bad for modelling growth
  • Evolutionary is better for growth
    • Competition is ‘jostling’ not perfect
    • Technology is endogenous
    • Better able to incorporate externalities and non-convexity
  • Bemoans fact that people focus on d/w loss too much and don’t see benefits of monopoly in terms of incentives for innovation
  • Neoclassical advice: (static) remove anything that prevents optimum allocation in perfectly competitive market
  • Evolutionary: like rents as induce more innovation
  • [ed]: things are a little more complex. Most neoclassical growth models do allow for rents for innovation — usually via patents (e.g. Romer) and there are the mainstream schumpeterian models of Aghion and Howitt/Grossman and Helpman
  • Lots of stuff that prevents first best
    • Fixed costs
    • Product differentiation
    • Imperfect and asymmetric information
    • Happiness (man is a social animal)
    • … (endless list)
    • [ed]: all of these are well-known to economists
  • So look at second-best: ‘rules of thumb’ such as reduce the largest distortion
    • Lipsey (2007) provides a whole bunch of objections to these approaches
  • Standard 2×2 matrix of rivalrous vs. non-rivalrous, excludable vs. non-excludable types of goods
  • Cumulative innovation: trade-off current and future (down-stream) innovation
  • Formal models are not enough: need judgement
    • Knightian uncertainty …

Evidence

  • How important were patents in history
    • Does not seem intro (or increasing use) of patents led to higher rate of innovation
  • Other examples of GPTs
    • technology came first and then IP came along after
  • Cites Watt/Boulton example of hold-up
    • Leads to invention/diffusion trade-off
  • Government intervention (R&D subsidies and procurement) can be very useful
    • See Ruttan (2001 and 2006)
    • ‘Governments can pick winners’ (but also many failures)

Patrick Waelbroeck: Music Variety and Retail Concentration

  • Sales of music have declined in France and other countries since ~2000
  • Also decline in variety of music
  • Two reasons advanced for decline in sales
    • Piracy (demand side)
    • Less variety (supply side) — focus here
  • Increase in retail concentration
    • More large stores
    • General purpose/food stores selling other goods (Wal-Mart etc)
  • This reduces variety available => Drop in sales
  • Retailers reduce variety because:
    • Cost to manage inventories
    • Promotions
    • Competition between products
  • Model:
    • vertical setup with 1 producer and 1/2 retailers
    • competition b/w retailers reduces double marginalization (good for welfare)
    • 3 options: vertical integration (VI), vertically separated monopolies (2M), monopoly in production and competition in retail (MC)
  • Results
    • VI more likely to have variety than 2M (some parameter values where VI has 2 products where 2M does not). Can solve with 2-part tariff.
    • MC leads to more variety than 2M
    • Lack of integration reduces incentive to launch new products
    • Retail concentration hinders product variety and total quantities sold

Marcel Boyer: The Value of Music to Commercial Radio Stations

  • Animated by case before Copyright Board Canada
    • 1997 Canadian copyright act was amended to include equitable remuneration for performers and makers
    • What is the correct level of equitable remuneration
  • What would a commercial radio station be willing to pay for music
  • Assume CR (commercial radio) allocate time between talk and music such that marginal value of each is equal
  • Estimate share of program content that is music:
    • Total over whole day (0600-0000): 76% unweighted and 75% weighted by number of listeners
    • Correct for advertising amount (as this varies greatly with time too)
    • Value attributed to sound recordings by day part:
      • 0600-0900: 25.9% of commercial value (12.95% sound recordings, 12.95% other)
      • 0900-0000: 74.1% (49.40%, 24.70%) respectively
    • Comes out at 60% recordings, 40% talk for value generated
  • Now observe payments to talk people (since in accounts)
  • So get payments for talk = total value x 0.4
  • Hence recordings value = total value x 0.6 = talk/0.4 x 0.6
  • Implies recordings are worth C$265 million gross
  • Taking into account music-related expenses of radio stations (not payments for the work itself but technicians etc.) reduces this to $127 million
  • Commercial radio sales are $1 billion
  • So copyright payments should be ~13% of sales
  • Current payments were $40 million and should be $127 million
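
The arithmetic in these notes can be retraced with a short Python sketch. The day-part shares and dollar figures are the ones recorded above; the function name `recordings_value` and the example talk payment of 100 are illustrative assumptions, since the notes do not record the observed talk payments.

```python
# Day-part shares of commercial value (percent), as recorded in the notes:
# (sound recordings, other content)
day_parts = {
    "0600-0900": (12.95, 12.95),
    "0900-0000": (49.40, 24.70),
}

recordings_share = sum(rec for rec, _ in day_parts.values())
other_share = sum(other for _, other in day_parts.values())
print(f"recordings {recordings_share:.2f}%, other {other_share:.2f}%")
# The talk rounds this split to 60% recordings / 40% talk.


def recordings_value(talk_payments, talk_share=0.4):
    """Value of recordings implied by observed talk payments.

    Follows the notes: talk payments = total value x talk_share,
    so recordings value = talk_payments / talk_share x (1 - talk_share).
    """
    total_value = talk_payments / talk_share
    return total_value * (1 - talk_share)


# Hypothetical talk payment of 100, purely to show the scaling.
print(recordings_value(100))

# Figures from the talk, in C$ millions.
net_recordings_value = 127
radio_sales = 1000
current_payments = 40

royalty_rate = net_recordings_value / radio_sales
increase_factor = net_recordings_value / current_payments
print(f"implied royalty rate: {royalty_rate:.1%}")
print(f"implied increase over current payments: {increase_factor:.2f}x")
```

On these numbers the proposed payments are a little over three times the current level.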

Martin Kretschmer: Copyright Earnings and Risk: An empirical study of writers’ income in Germany and the UK

  • Look at earnings of authors (of text)
  • Where do they get their money from
  • What is distribution of earnings
  • Gini coefficient: measures inequality in earnings
  • UK ALCS payments 2005
    • all authors: 369 mean, 80 median, gini: 0.72
    • authors with more than 50% of income from writing: mean 28k, median: 12k, gini: 0.63
  • Germany (note earnings are net of tax)
    • all authors: …
    • authors with more than 50% of income from writing: mean 13k, median 8k, gini: 0.56
  • Professional authors: (UK) 40% earn all income from writing (Germany similar)
  • Collecting society income distribution is less equal than general writing income (which is surprising given supposed ‘diversification’ role of collecting societies)
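
For readers unfamiliar with the Gini coefficient used above, here is a minimal Python sketch of one standard way to compute it from a list of incomes. The sample incomes are invented for illustration and are not taken from the study.

```python
def gini(incomes):
    """Gini coefficient of a list of non-negative incomes.

    Uses the sorted-rank formula
        G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    where x_1 <= ... <= x_n. 0 means perfect equality; the maximum
    for n people is (n - 1) / n, reached when one person earns everything.
    """
    xs = sorted(incomes)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive income")
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical, heavily skewed earnings: many small payments, a few large ones.
sample = [20, 40, 60, 80, 80, 100, 150, 300, 900, 2000]
print(f"mean {sum(sample) / len(sample):.0f}, gini {gini(sample):.2f}")
```

A mean far above the median, as in the ALCS figures, is the typical signature of a high Gini: a few large earners pull the average up.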

Stan Liebowitz: The Impact of Copyright on the Price of Books

  • Compare prices of in and out of copyright books
  • Best-selling books for each year from 1895-1940
  • Collected data on list price, sales price (Amazon), number of pages, binding type, isbn#, type of writing and so forth
  • 2445 observations based on 603 unique titles
    • 280 separate publishers. 188 publishers had fewer than 2 books in the sample
  • We limited sample to publishers with more than 10 books (23 such publishers) and who published both copyrighted and non-copyrighted books (12 publishers) (and had to have at least 80/20 ratio i.e. min of 20% in copyright, 20% out of copyright)
  • Leaving 872 observations, but many were thrown out because they were e-books or because some data was missing
  • 66% of total is out of copyright (this is higher than number of distinct titles because pd books have more editions)
  • Distributions by publisher: pretty even except for Kessinger which has ~10x everyone else (will turn out to be suspicious)
  • Simple OLS: copyright dummy has no effect (r-squared 0.285)
    • robustness: the same (remove outliers)
  • Publisher dummies: copyright dummy has impact of around 13%
    • robustness: back to not significant (remove outliers)
  • Standard royalty contract is 10-15% of price
    • So this looks very efficient because authors get no rents
  • Some of these publishers may be pirates (Amereon, Buccaneer and Kessinger have had complaints)
  • Exclude doubtful publishers and restrict to Simon/Schuster, Penguin and Dover
    • Number of observations is much smaller (72)
    • OLS regression: copyright effect is 22% increase in price
    • Robustness raises this to ~27%
  • Suggests 50/50 split for royalty rates (but might be too high depending on author power)
  • Deadweight losses:
    • 1-2 elasticity of demand with 20-25% copyright impact gives c/dw of 4-6.25
    • at 3 elasticity is 6-9.38
    • at 0.5: 1-3%
    • [ed: depending on how large costs are (and for books they seem large relative to copyright) then need to do some work to convert to ratio of d/w loss to welfare]
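
The deadweight-loss figures above look consistent with a Harberger-triangle approximation, in which the loss as a share of revenue is 0.5 x elasticity x (proportional price increase) squared. That formula is an assumption used here for illustration, not something recorded from the talk; it reproduces the 4-6.25% range at an elasticity of 2 and the 6-9.38% range at an elasticity of 3. A minimal Python sketch:

```python
def dwl_share(elasticity, price_increase):
    """Harberger-triangle deadweight loss as a share of revenue.

    Assumed approximation (not stated in the talk):
        DWL / revenue = 0.5 * elasticity * price_increase ** 2
    where price_increase is the proportional price rise due to copyright.
    """
    return 0.5 * elasticity * price_increase ** 2


# Copyright price impact of 20-25%, for the elasticities mentioned in the notes.
for elasticity in (0.5, 1, 2, 3):
    low = dwl_share(elasticity, 0.20)
    high = dwl_share(elasticity, 0.25)
    print(f"elasticity {elasticity}: {low:.2%} to {high:.2%} of revenue")
```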

[ed]: Really great to see this kind of empirical work being done and this is a nice approach to an important question. Some queries:

  • Looking at average prices rather than lowest price per title (and unweighted by sales) could be a problem
  • Imagine we have a book selling at $10 before going into PD. It goes into PD and there are now two versions: the official version produced by the original publisher, still at $10, and a new cheap edition at $5. Then the average price has only dropped to $7.50 even though all the newly served demand (which was d/w loss before) comes from the $5 book. Thus the impact of copyright is estimated at $2.50/$10 = 25% when in fact it was $5/$10 = 50%
  • Heald's data is useful on this. He finds that public domain books have on average 5.2 (6.2 including ebooks) editions while copyrighted books have an average of 3.2. Turning to a subset of especially durable (popular) works, Heald provides a price comparison, finding that ‘durable’ public domain books have an average lowest cost of $3.85 while copyrighted books have an average lowest cost of $8.05 (restricting to well-known major publishers gives $5.80 and $8.90 respectively). Unfortunately he does not give the average price, but this indicates that the lowest price is really quite different (and that restricting to well-known publishers will bias results).
  • Digital world: this focuses on books in hard-copy format
    • But what about e.g. Project Gutenberg. I can get a digital version of PD stuff from there for zero.
    • Furthermore books are the least ‘digitizable’ item (most of us still want the dead-tree version).
    • So assume that for material such as music and film would likely see a bigger effect.
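
The averaging concern in the $10/$5 example above can be made concrete with a small Python sketch. The function name and its inputs are hypothetical; the prices are the ones from the example.

```python
def copyright_effect(copyright_price, pd_edition_prices):
    """Estimated copyright price effect under two measures.

    Returns (effect_using_average, effect_using_lowest), each expressed
    as a share of the in-copyright price.
    """
    average = sum(pd_edition_prices) / len(pd_edition_prices)
    lowest = min(pd_edition_prices)
    by_average = (copyright_price - average) / copyright_price
    by_lowest = (copyright_price - lowest) / copyright_price
    return by_average, by_lowest


# One title at $10 in copyright; once PD, a $10 official and a $5 cheap edition.
by_average, by_lowest = copyright_effect(10, [10, 5])
print(f"using average price: {by_average:.0%}")
print(f"using lowest price:  {by_lowest:.0%}")
```

Using the unweighted average halves the measured effect in this case, which is why lowest price per title (or sales-weighted prices) may be the more informative comparison.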

How long should copyright be? Should we increase or decrease the strength of copyright during periods of rapid technological innovation? These are all questions I address in my paper entitled Forever Minus a Day? Some Theory and Empirics of Optimal Copyright, which I will be presenting at the 2007 SERCI Congress in Berlin this week.

For those who want to know more, the full abstract is below and the latest version of the paper can be downloaded from:

http://www.rufuspollock.org/economics/papers/optimal_copyright.pdf

Update: (2007-07-13) there are a set of summary slides available here:

http://www.rufuspollock.org/economics/papers/optimal_copyright_talk.pdf

Update: (2007-07-16) for those interested in republishing, redacting, or otherwise reusing the paper I should state clearly that it (and this blog) are licensed under a Creative Commons Attribution (by) license v3.0

Update: (2007-08-07) I’ve produced an updated version of the paper. This includes a variety of small corrections (typos etc.) and some more substantial reworking. In particular, due to the reinclusion of the ‘small’ middle term in the statement of Theorem 13 and in subsequent calculations (previously in the proof but omitted from the statement as small), optimal term has increased from 14 years to 15 years (which bears out the original assumption that the term was small, but it is nice to explicitly include it). There is also a new figure (Fig 1 in the updated paper) which gives the most accurate representation of the results in the form of the probability distribution of optimal term given the parameter range being used.

Abstract

The optimal level for copyright has been a matter for extensive debate over the last decade. This paper contributes several new results on this issue divided into two parts. In the first, a parsimonious theoretical model is used to prove several novel propositions about the optimal level of protection. Specifically, we demonstrate that (a) optimal copyright falls as the costs of production go down (for example as a result of digitization) and that (b) the optimal level of copyright will, in general, fall over time. The second part of the paper focuses on the specific case of copyright term. Using a simple model we characterise optimal term as a function of a few key parameters. We estimate this function using a combination of new and existing data on recordings and books and find an optimal term of around fifteen years. This is substantially shorter than any current copyright term and implies that existing copyright terms are too long.