@CadeMetz @ceciliakang @sheeraf @stuartathompson @nicogrant: How Tech Giants Cut Corners to Harvest Data for A.I.


[This is a must-read, deeply researched, long-form article about how Big Tech–mostly OpenAI, Google and Microsoft–are abrogating consumers’ trust and their promises to creators in a mad, greedy, frothing rush to some unknown payoff with AI. The Dot Bomb boom is dwarfed by the AI gold rush, but this article is a road map to just how bad it really is and how debased these people really are. Thanks to the destruction of the newsroom, only a handful of news outlets can deliver work of this quality, but thankfully the New York Times is still standing. How long is another story.]

OpenAI, Google and Meta ignored corporate policies, altered their own rules and discussed skirting copyright law as they sought online information to train their newest artificial intelligence systems….

OpenAI researchers created a speech recognition tool called Whisper. It could transcribe the audio from YouTube videos, yielding new conversational text that would make an A.I. system smarter.

Some OpenAI employees discussed how such a move might go against YouTube’s rules, three people with knowledge of the conversations said. YouTube, which is owned by Google, prohibits use of its videos for applications that are “independent” of the video platform.

Ultimately, an OpenAI team transcribed more than one million hours of YouTube videos, the people said….

Like OpenAI, Google transcribed YouTube videos to harvest text for its A.I. models, five people with knowledge of the company’s practices said. That potentially violated the copyrights to the videos, which belong to their creators.

Last year, Google also broadened its terms of service. One motivation for the change, according to members of the company’s privacy team and an internal message viewed by The Times, was to allow Google to be able to tap publicly available Google Docs, restaurant reviews on Google Maps and other online material for more of its A.I. products.

The companies’ actions illustrate how online information — news stories, fictional works, message board posts, Wikipedia articles, computer programs, photos, podcasts and movie clips — has increasingly become the lifeblood of the booming A.I. industry. 

Read the post on New York Times.

Search is Ripping Us Off Again, this time for AI

By Chris Castle

In case you think that infringement of copyrights in AI training happens only to rock stars, photographers, record companies or movie studios, let me give you a personal anecdote that might cause you to reconsider. I asked Bing’s Copilot AI tool to list the deal points in a record producer agreement, and this is what I got back:

Copilot

Certainly! When negotiating a producer agreement, several key points should be considered. Here are some common negotiation points from the artist’s perspective:

  1. Scope of Engagement:
    1. The producer is generally engaged for one album project at a time.
    2. Clarify the scope of work, including the number of songs or recordings to be produced.
  2. Budget and Compensation:
    1. Determine the budget for the project (e.g., $10,000 in your case).
    2. Agree on the producer’s compensation, which may include an upfront fee, points on sales, or a combination.
  3. Ownership and Rights:
    1. Address ownership of the master recordings:
      1. Will the artist own and control the masters?
      2. Does the producer retain any rights?
  4. Credits and Royalties:
    1. Specify how the producer will be credited on releases.
    2. Discuss any royalties or backend points for the producer.
  5. Recording Schedule:
    1. Set a timeline for recording sessions.
    2. Define the two-week period for your project.
  6. Logistics:
    1. Determine where the recordings will take place.
    2. Outline the artist’s approval process.
    3. Define the producer’s deliverables.
  7. Termination Clause:
    1. Include provisions for termination if either party breaches the agreement.
    2. Address payment for work completed up to that point.

Remember that each negotiation is unique, and it’s essential to tailor the agreement to your specific needs and circumstances. Legal advice is recommended to ensure compliance with local laws and industry standards [1, 3, 4].

Source(s)

1. An Artist’s Guide to Negotiating Record Producer Agreements – Medium

2. Negotiating a Music Producer Agreement: 7 Key Issues

3. 5 Things to Look for in a Music Producer Agreement – Lawyer Drummer

4. How to Negotiate: Steps, Phrases & Tips to Win the Deal – Selling Signals

Notice that Copilot lists “Source(s)” and that source #1 is an article from the Medium website.

My article.

I’ve written a number of articles about negotiating producer agreements and I can categorically say I haven’t given permission or been asked to give permission for any of those articles to be scraped and used to train AI. So how did it come to be in Copilot’s training materials? 

My guess is that Medium (and probably a lot of other sites) was crawled by Bing’s spider and that copies of Medium pages were stored in Bing’s (or Microsoft’s) data centers. When Microsoft launched its Copilot AI tool, one could imagine that the temptation would have been strong to use those copies as part of the training materials for Copilot. In fact, the Verge reports that “companies around the web are using your site and its data to build massive sets of training data, in order to build models and products that may not acknowledge your existence at all.” Can you say Section 230 or DMCA? Maybe there will be a question at the April 11 House Energy & Commerce hearing on Section 230.

So Google-style “innovation” has a self-destructive flavor consistent with their oikophobic capitalism. Greater Silicon Valley’s view of culture in general and copyright in particular is as sources of wealth extracted by destruction–you know, creative destruction, permissionless innovation, etc. (It’s no wonder Google was inexorably attracted to defense contracting despite all the “don’t be evil” hoorah. After all, what creates massive wealth faster than convincing governments to pay big money to blow things up that must be replaced by ever more big money to blow even more things up?)

Are you surprised then that two of the biggest operators in the AI space are the search engine operators Google and Microsoft? This is another example of how Big Tech helps itself to your data and work product without you even knowing it’s happening. So now what? I now know I’m being ripped off, and I’m wondering if Medium is in on it.

The Verge tells us:

The ability to download, store, organize, and query the modern internet gives any company or developer something like the world’s accumulated knowledge to work with. In the last year or so, the rise of AI products like ChatGPT, and the large language models underlying them, have made high-quality training data one of the internet’s most valuable commodities. That has caused internet providers of all sorts to reconsider the value of the data on their servers, and rethink who gets access to what. 

Ya think?

The First Shot Across the Bow at the MLC’s “Redesignation” Proceeding #TheReup

We must always tell what we see. Above all, and this is more difficult, we must always see what we see.
Charles Peguy

By Chris Castle

The Reup is on! MTP readers will remember that The MLC, Inc. is in the beginning of its “redesignation” proceeding before the U.S. Copyright Office that we call “the reup,” because…because….well, you have to laugh at some point. Having appointed (or “designated”) The MLC, Inc. as the statutory mechanical licensing collective in 2019, the Copyright Office is required by statute to review The MLC, Inc. to see how they are doing with their exclusive monopoly over songwriter streaming mechanical collections.

It’s important to remember that the mechanical licensing collective (lower case) is a statutory body. Congress tasked the head of the Copyright Office with selecting an entity to actually do the work. In a shocker that rocked the industry, the Copyright Office selected (or “designated”) the favorite corporation of the National Music Publishers Association and the Nashville Songwriters Association International that styled itself “The MLC, Inc.” 

The MLC, Inc. then turned right around and selected the Harry Fox Agency as its data vendor to actually run the accounting part of the collective–another shocker. If you thought you were going to escape the hubris and incompetence of HFA under the glorious revolution of the Music Modernization Act, tough break. So it is now the Copyright Office’s decision to either redesignate The MLC, Inc. (and by default, HFA) for another five years of holding onto your money in their vast black box, or find someone else.

And just to be clear, these exclusive appointments or “designations” last for five years. Every five years, Congress required the Copyright Office to take a critical look at the wisdom of their prior decision and determine after soul-searching and self-criticism whether they should ratify their previous genius by extending the monopoly another five years. As Congress said in the legislative history narrative:

The Register [the head of the Copyright Office] is allowed to re-designate an entity to serve as the collective every 5 years after the initial designation. Although there is no guarantee of a continued designation by the collective, continuity in the collective would be beneficial to copyright owners so long as the entity previously chosen to be the collective has regularly demonstrated its efficient and fair administration of the collective in a manner that respects varying interests and concerns. In contrast, evidence of fraud, waste, or abuse, including the failure to follow the relevant regulations adopted by the Copyright Office, over the prior five years should raise serious concerns within the Copyright Office as to whether that same entity has the administrative capabilities necessary to perform the required functions of the collective. In such cases, where the record of fraud, waste, or abuse is clear, the Register should give serious consideration to the selection of a new entity even if not all criteria are met pursuant to section 115(d)(3)(B)(iii).

So the way this is going to go down, according to the Copyright Office, is that they will seek a kind of thesis defense from each of The MLC, Inc. and the MLC’s counterpart for the digital services, called the Digital Licensee Coordinator or “the DLC,” which we often forget is there. Then the public gets to comment on how things are going.

Let’s understand how this game is played. Nobody likes to open the kimono and have their operations examined. But opening the kimono is actually a much bigger deal for the MLC than for the DLC. The MLC has a lot of functionality that perpetuates the same old spaghetti code from HFA and the need to hide it from sunlight. In my view the sense of entitlement and hubris is overwhelmingly stronger at The MLC, Inc. than at the DLC. Remember, the DLC pretty much just writes the overpriced checks to keep MLC executives in the style to which they have become accustomed (see Trichordist “Know Your MLC 2022”).

We are starting to get a sense of how the DLC is going to approach the reup proceeding given a recent blog post by Graham Davies, the new head of the Digital Media Association. DiMA essentially is the DLC. Technically, the DLC’s mission is to represent all users of the blanket mechanical license, and I think perhaps for the first time, the DLC will represent all the users both large and small, not just DiMA members. Let’s take a look at some of the points Graham raised.

The Insult of Governance

But first, remember that the MMA created the first US mechanical licensing CMO. This was an event that had been coming for oh, say 100 years round numbers. The first difference between the US and most other countries is that in the US there is not equal board representation between publishers and songwriters. This is an insult to songwriters. 

That’s right–in the rest of the world, songwriters have at least equal representation. Just call it what it is, it’s an insult. And not a casual insult or the insult of low expectations. This insult is right in your face.

There will be a lot of rending of garments about the unfairness of the MLC’s board composition and that’s all fine, but know this: You will not change the board composition until you change the mindset that produced the board composition.

What is astonishing about how this happened is that before they get to Washington, all these publishers with board seats have good relations with songwriters and value their writers. Do we have arguments inside the family? Sure. But something happens to these publishers when they get to Washington, DC and they go rogue or they are encouraged to go rogue. 

So I would encourage these board members to come back to your values and what you hold dear and don’t listen to the bad advice. The bad advice didn’t build your companies; your relations with your songwriters did. Yet there is such hostility toward this board composition that it will take you years to overcome the insult and the distrust it produced. It didn’t have to happen that way and it should not be allowed to continue.

No Free Lunch

The next big difference is that the cost of standing up and operating the MLC is borne by the licensees. There is a reason that this doesn’t happen in any other country–it is a bullshit idea. It OBVIOUSLY produced an inherent conflict of interest at the outset. Does it shovel money onto the kitchen tables of the insiders? Of course. Does it feed into salaries, bonuses and T&E of the MLC? Oh, yes. So let’s see what Graham Davies has to say about this one.

For starters, here’s a headline: THE MONEY IS NOT HAPPY. Get it? What do you think happens when the money is not happy? Maybe, just maybe, you think they might not want to keep paying? Maybe just maybe they gave you your lead for five years and let you get good and hooked before they started reeling you in?

As Graham says:

All around the world, it is the rightsholders who bear the cost of the collectives licensing their rights, and copyright offices or similar government bodies often have oversight powers over the collectives to ensure that royalties are distributed fairly and the collectives operate efficiently.  

In the US, unlike anywhere else in the world, legislators placed the burden of funding the collective’s operations on the licensees as opposed to the rightsholders. This particular arrangement was a feature of the statute, but means a collective’s traditional incentives for optimum performance are not inherently built in and may become skewed. [Now there’s a shocker.]

This structure makes it even more important that the Copyright Office ensures fair and efficient operation of the collective, including for those who fund it.

How can you read that and not realize that THE MONEY IS NOT HAPPY? See what you see. Anyone who believed that the licensees large and small would just go on writing the checks for absurd salaries and ridiculous travel and entertainment expenses must be from Washington.

Oversight Culture Clash

This goes hand-in-hand with the true problem with the entire megillah which is where Graham starts: Lack of oversight. Don’t blow past this. 

Remember, DiMA represents the biggest corporations in commercial history and make no mistake–they own Washington, DC. So when the DiMA members look at this oversight issue, from their point of view the government works for them and the government is falling down on the job. The money is not happy. See what you see.

Oversight is a key part of Graham’s complaint.

As we embark on the redesignation process, oversight of the mechanical licensing collective is a key issue. Collective licensing is common for many rights in the music sector, because it is a sensible solution for reducing transaction costs and improving efficiencies between rightsholders and licensees….

The MMA mandated that the MLC be run by a Board made up largely of music publishers and some songwriters. While it makes sense for rights holders to have oversight over a collective of their rights, it has become apparent in the five years since the MMA was passed, that this structure, without guardrails and robust oversight, provides little incentive for the collective to carefully weigh risks and conduct rigorous cost-benefit analysis of decisions before action. [Like any CMO conducts a “rigorous cost-benefit analysis”–try not to laugh, but you get the idea.] This is of great importance because without a clearly circumscribed remit for The MLC, the positions the collective takes can have significant consequences for the functioning of the US music market.

The record shows that in passing the MMA, Congress chose to establish a collective that would serve as the administrator of  the mechanical blanket license….Congress [did not] intend to write the collective a blank check.  Indeed, Congress was astute in requiring that streaming services be responsible only for the reasonable costs of the collective. Such reasonable costs relate to the collective’s core functions – such as work registration and matching. Where The MLC has focused on these core functions, there is good work [no there isn’t], particularly in the context of the relatively short window from designation to operation [already making excuses]. However, where The MLC has gone beyond its remit, there has been, and continues to be cause for concern. Reasonable costs of the collective cannot include everything from traveling to distant countries to conduct outreach to songwriters far beyond the U.S. licensing system, to suing one of the licensees that pays its costs — using licensee money to pursue its allegations against a licensee on a novel legal theory. [This is the Pandora lawsuit filed by The MLC, Inc. I was wondering how long that would take to get under the skin.]

I take Graham’s point and understand his frustration (and discretion in not calling out the ridiculous salaries). But it must also be said that only lobbyists in the Imperial City would have drafted Title I of the MMA to provide for oversight of a private company by a government agency. That’s just idiotic. First of all, it’s really unfair to expect the Copyright Office to supervise the MLC’s travel and entertainment expenses. They barely have the resources to manage their own operations much less have oversight on Kris Ahrend’s tips in transit. It’s also just not in the cerebral culture of the Copyright Office to have the kind of dressing down relationship with the MLC that would be necessary for financial oversight. 

I also have to call bullshit on this complaint about costs being framed as an oversight issue. Yeah, sure, I guess on some level everything is an oversight issue. But if anything, this is an issue for the board of directors at the MLC which includes the DLC. But in most companies it’s a management issue for the CEO and the CFO. So if Graham has a beef about T&E (which sounds like a legitimate beef and is not the first I’ve heard of it), he needs to take it up with the management. You know, the management that reports to the board the DLC sits on (nonvoting or not).

Alternatively, the operating budget of the MLC comes through the Copyright Royalty Board which approves the budget in the form of the “Administrative Assessment.” The DLC can raise these complaints about spending in that forum as well and really should.

So Graham raises some important points that we should be aware of as the MLC enters its all-important reup proceeding. Stay tuned for responses.

Guest post by @TheBlakeMorgan: A musician’s view of the TikTok legislation

Here’s a musician’s perspective on the TikTok legislation before Congress: I hope it passes, both as an American, and as a music maker. (The bill is the “Protecting Americans From Foreign Adversary Controlled Applications Act” (HR 7521). It was recently introduced by Representatives Mike Gallagher (R-Wis.) and Raja Krishnamoorthi (D-Ill.). The bill passed the House by a vote of 352-65, demonstrating deep bipartisan support.)

First––this bill restricts TikTok, it does not “ban” the app. It forces the company to separate its ties to the Chinese Communist Party and prevents them from accessing the data of Americans. That’s a good thing.

The bill doesn’t mandate or regulate speech, it’s focused on national security: the FCC called TikTok “a clear and present danger” to our country.

Second––music makers already know what music lovers are just now learning: TikTok is the worst, most exploitative streaming platform for music, anywhere. The vast majority of music on TikTok generates virtually no revenue for the musicians who made it, and even more music on the platform is completely unlicensed (stolen), copied (stolen via AI), or pirated (stolen).

Simply put, TikTok is trying to build a music-based business without paying music makers fair value for the music.

Lastly––musicians (and Americans) are all too familiar with being underpaid and undervalued, with our data being scraped and sold, with platforms which promote hate speech, bigotry, and bullying.

But TikTok does all of this and more, while posing an existential national security threat to our country.

It’s rare to see independent musicians (like me) stand with major labels, and it’s rare to see Republicans and Democrats stand together about anything. But here we are. I hope it passes the Senate and that President Biden signs it.

MLC “ReUp” Proceeding Highlights Ownership Issues for Your Musical Works Database When You Correct To Collect?

Guest post By Chris Castle

Ever wonder who owns the registration data you have slaved to correct and recorrect at your own cost when you “Play Your Part” to “Correct to Collect” at the MLC?

Remember, the sainted Music Modernization Act gave the lobbyists a vehicle to create their mechanical licensing collective in the US that was going to solve all of Big Tech’s problems. The MMA, unsurprisingly, also gave Big Tech a brand new copyright infringement safe harbor arising out of the Spotify class actions. Generations of the children of lawyers and lobbyists will be put through college–thank you songwriters!

One of the few things Congress got right in Title I of the Music Modernization Act is the five-year review of the mechanical licensing collective. Or more precisely, whether the private company previously designated by the Copyright Office to conduct the functions of the Mechanical Licensing Collective (The MLC, Inc.) should have another five years to continue doing whatever it is they do.

Impliedly, and I think a bit unfairly, Congress told the Copyright Office to approve its own decision to appoint the current MLC or admit they made a mistake. This is yet another one of the growing list of oversights in the oversight. Wouldn’t it make more sense for someone not involved in the initial decision to be evaluating the performance of the MLC? Particularly when there are at least tens of millions of dollars changing hands as well as some highly compensated MLC employees, any one of whom makes more than the Copyright Royalty Judges. The MLC’s budget (paid by the services they oversee) was $32,900,000 in 2023 and will be $39,050,000 this year because, you know, the budget is indexed to inflation, just like streaming mechanicals…oh sorry. Not like streaming mechanicals.
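For anyone who wants to check the “indexed to inflation” jab, here is a minimal sketch of the arithmetic using only the two budget figures quoted above. The function name is made up for illustration; nothing here comes from the MLC or the Copyright Royalty Board.

```python
def percent_increase(previous, current):
    """Return the year-over-year change as a percentage of the previous figure."""
    return (current - previous) / previous * 100


# Budget figures as quoted in the paragraph above.
budget_2023 = 32_900_000
budget_this_year = 39_050_000

increase = percent_increase(budget_2023, budget_this_year)
print(f"Budget increase: {increase:.1f}%")
```

On those two numbers the increase works out to a little under 19 percent in a single year.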

Who Owns the Database?

What happens if the Register of Copyrights actually fires The MLC, Inc. and designates a new MLC operator? The first question probably should be what is The MLC, Inc.’s plan for a handoff to a successor. But since that doesn’t exist, it instead should be what happens to the vaunted MLC musical works database and the attendant software and accounting systems which seem to be maintained out of the UK for some reason.

I actually raised this ownership question in a comment to the Copyright Office back in 2020. In short, my question was probably more of a statement: “The musical works database does not belong to the MLC or The MLC and if there is any confusion about that, it should be cleared up right away.”

The Copyright Office had a very clear response:

While the mechanical licensing collective must “establish and maintain a database containing information relating to musical works,” the statute and legislative history emphasize that the database is meant to benefit the music industry overall and is not “owned” by the collective itself….Any use by the Office referring to the public database as “the MLC’s database” or “its database” was meant to refer to the creation and maintenance of the database, not ownership. [85 FR at 58172, text accompanying notes 30 and 31.]

So if the current operator of the MLC is fired, we know from the MMA and the Copyright Office guidance that one thing The MLC, Inc. cannot do is hold the database and its attendant systems hostage, or demand payment, or any other shadiness. These items do not belong to them so they must not assert control over that which they do not own. Neither does the database belong to any contractor if for no other reason than the MLC, Inc. cannot transfer to a contractor something that the MLC, Inc. doesn’t own in the first place.

Another thing that doesn’t belong to The MLC, Inc. is the hundreds of millions of dollars in black box money that the MLC, Inc. has failed to distribute in going on four years. I’ve even heard cynics suggest that the market share distribution of black box will occur immediately following The MLC, Inc.’s redesignation and the corresponding renewal of HFA’s back office contract which seems to be worth about $10 million a year all by itself.

What would also have been helpful would be for Congress to have required the Copyright Office to publish evaluation criteria for what they expected the MLC’s operator to actually do as well as performance benchmarks. Like I said, it’s a bit unfair of Congress to put the Copyright Office in the unprecedented position of evaluating such an important role with no usable guidance whatsoever. Surely Congress did not intend for the Copyright Office to have unfettered autonomy in deciding what standards to apply to their review of a quasi-governmental agency like the MLC? Yet Congress seems to have defaulted to the guardrail of the Administrative Procedure Act or some other backstop to sustain checks and balances on the situation.

But at least the ownership question is settled.

Show Me the Splits: Tiffany Red Illuminates Stealing Publishing

By Chris Castle
(A version of this post appeared on MusicTechPolicy and on Hypebot)

It’s unfortunately an old story, but that doesn’t make it right.

Among the most underpaid creatives in our business are songwriters who “just” write songs. “Just” is an odd word to use but it’s a common way to refer to those who give artists a voice because it really does all start with the song. And as Tiffany Red says in her video, the system is simply unjust.

“The system” is what has always been called “stealing publishing”. This is when an artist or a producer (and it happens with producers but for different reasons) threatens not to record a song unless the songwriters who created it hand over a chunk of the publishing. The amount can range all over the place, but often is at least 25% of the copyright. So not only are they not entitled to the song’s earnings as a financial interest, they are definitely not entitled to the copyright because they created nothing.

On top of it, songwriters often have to eat many costs in order to get the song written, demoed and pitched. (I can’t tell you the number of times the songwriter demo essentially becomes the arrangement of the final recording, so “demo” is relative.) There are a bunch of opportunities along the way for people to write themselves into the song when all they did was a job that they were probably being paid to do anyway. I have even encountered producers whose managers demanded a piece of publishing for the producer to even listen to an artist’s demos.

On the producer side, some producers want a piece of all publishing on the record and if they actually write they want their contributor share as a writer ON TOP of the publishing they are already stealing. Why? What possesses anyone to think they are entitled to do this? And “entitled” is exactly the right word. 

One reason they steal publishing is that the producer royalty is unlikely to even recoup the producer advance in a streaming reality unless the track is a huge hit. (Remember that a producer gets a percentage of what the artist gets, say 30%ish, and the artist gets somewhere around 50% of the fraction of a penny per stream.) This is especially true of producers who enjoyed a lifestyle in the pre-streaming era and are trying to keep it going. It’s understandable, but that doesn’t make it right.
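To put rough numbers on that, here is a back-of-the-envelope sketch of how many streams it takes for a producer royalty to recoup an advance. The 50% artist share and 30% producer share are the round figures mentioned above; the $0.004 per-stream payout and the $10,000 advance are purely hypothetical placeholders, not figures from any actual deal or service.

```python
import math


def streams_to_recoup(advance, per_stream_payout,
                      artist_share=0.50, producer_share_of_artist=0.30):
    """Estimate the streams needed before the producer royalty covers the advance.

    All inputs are illustrative assumptions: per_stream_payout is the
    (hypothetical) amount paid per stream before the artist's cut,
    artist_share is the artist's fraction of that amount, and
    producer_share_of_artist is the producer's fraction of the artist's cut.
    """
    artist_per_stream = per_stream_payout * artist_share
    producer_per_stream = artist_per_stream * producer_share_of_artist
    return math.ceil(advance / producer_per_stream)


# Hypothetical example: a $10,000 producer advance at $0.004 per stream.
needed = streams_to_recoup(advance=10_000, per_stream_payout=0.004)
print(f"Streams needed to recoup: {needed:,}")
```

Under those assumptions the producer earns about six hundredths of a cent per stream, so the track needs somewhere north of 16 million streams before the advance is recouped. Change the placeholders and the answer moves, but the order of magnitude is the point.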

And remember, the songwriter isn’t getting an advance. On top of the insult of stealing publishing, the artist has no intention of paying for it because the songwriter should consider themselves lucky to get the cover–which often is a career making record for the artist opening up income streams the songwriter never participates in.

When faced with these overreaching demands, songwriters have to make some hard choices. Occasionally I get to tell the artist’s team to fuck off. More often though–as Tiffany says–songwriters acquiesce.

I think Tiffany is also hitting at a point that Merck Mercuriadis made at the last Artist Rights Symposium that David hosts at the University of Georgia:

Let’s face it—this is insulting.  If I sat down and explained to my decent Greek working class parents that this is how songwriters get paid, they’d be shocked.  If you went to your bank manager and explained how songwriters get paid, they’d be shocked.  Doctors, lawyers, everyone who has some understanding of the economics of the world or what drives an industry and what creates value for an industry would be shocked by how songwriters are paid. 

But nobody can bring the frustration home like a songwriter on the receiving end of this injustice. Watch Tiffany’s video. Take 15 minutes out of your life and watch it from beginning to end with no distractions. She’s absolutely correct that things will start to change only when the artists stop, when they let their team know that stealing publishing is not acceptable and that anyone who does it is not only not helping the artist but will be fired.

She’s right about something else, too. A songwriter shouldn’t need a gatekeeper to protect them in a situation that should not be happening in the first place. There’s a line that we all learn from parents, teachers, coaches, mentors, the line between acceptable and unacceptable treatment of other humans, right and wrong if you like although that’s a bit simplistic. Stealing publishing is wrong, stealing publishing is on the wrong side of that line. This is what I think whenever I have to deal with the situation–how do you sleep at night?

Watch the video. It’s not a rant, it’s the truth.

The Broadcasters are Still Free Riding on Artists #IRespectMusic

Broadcasters set the example followed by Google, YouTube, Spotify and now TikTok. Let’s tell Congress broadcasters have to pay reasonable royalties in the bipartisan American Music Fairness Act.

Please join the fight and sign the petition at the MusicFirst Coalition!