
Legal and IP Data

What Is Legal and IP Data?

Legal and IP data is the collection of data and metadata about law and intellectual property (IP). Legal data includes information about cases, judges, jurisdictions, and so on. IP data includes information about creative works, such as who owns them and where they are protected.

Where Does Legal and IP Data Come From?

At its core, this data is legal in nature and publicly available. Creative works are either submitted to governmental bodies (for example, as patents) or otherwise subject to copyright protections.

Legal data comes from case records that are public but were, until recently, sometimes difficult to access because they existed only in law books. Laws and cases are increasingly being recorded digitally instead of physically, and many projects digitize case texts and make them accessible to anyone with an internet connection.

Additional legal data may include interviews with lawyers and judges.

What Types of Columns/Attributes Should I Expect When Working with This Data?

Common attributes of legal data include metadata such as case name, docket number, court, decision date, decision, and jurisdiction. The text of the cases themselves is the actual data. Although clerks and volunteers work to digitize this data, its sheer volume makes accuracy a problem for public-access legal databases such as the Caselaw Access Project's.

Legal data attributes also overlap with other industry-specific data. A lawyer specializing in banking law requires banking and financial industry data, for example.

Finally, IP data attributes include trademark, media type, owner name, and jurisdiction.
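
As a rough illustration of the attributes listed above, the following Python sketch models one legal record and one IP record. The class names, field names, and example values are hypothetical and chosen only for readability; real databases will use their own schemas.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class CaseRecord:
    """Hypothetical legal-case record; field names are illustrative only."""
    case_name: str
    docket_number: str
    court: str
    jurisdiction: str
    decision_date: Optional[date] = None
    decision: Optional[str] = None  # e.g. "affirmed" or "reversed"
    text: str = ""                  # full text of the case, the actual data


@dataclass
class IPRecord:
    """Hypothetical IP record; field names are illustrative only."""
    trademark: str
    media_type: str
    owner_name: str
    jurisdiction: str


# A made-up case used purely as a placeholder, not a real citation.
example_case = CaseRecord(
    case_name="Example v. Sample",
    docket_number="00-0000",
    court="Example District Court",
    jurisdiction="Example State",
    decision_date=date(2000, 1, 1),
    decision="affirmed",
)
```

Keeping the metadata separate from the case text, as in this sketch, makes it easier to search and filter records without loading every full opinion.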

What Is This Data Used For?

This data is primarily aggregated into databases for reference purposes. Developers have also made digitizing and searching it much easier, with translation, image search, speech-to-text, and other features.

That is not all, though: with big data come artificial intelligence programs. For legal data, AI programs can analyze judges’ decisions and predict their rulings, suggesting approaches and arguments that lawyers may find successful.

Law firms may also use legal data for competitor and market analysis, for example by comparing their rates and compensation packages to those of other firms in their field and area.

Meanwhile, advances in artificial intelligence signal a revolution in IP law. AI has begun to create: to paint, to write, to compose music. In response, questions about how to assign intellectual property rights have arisen all over the world.

How Should I Test the Quality of Legal and IP Data?

Humans have already reviewed most of this data many times before making it available online, so the data itself needs little quality control. Problems may arise when the data is digitized and collected into larger databases, as mentioned above for the public-access Caselaw Access Project. However, mistakes in public databases are easily remedied.

Data collected and used by vendors that provide additional analysis and services is more difficult to test for quality. In this case, consider the vendor’s reputation and request a sample dataset to check for completeness, consistency, and relevancy. Timeliness or frequency of update is less important for this category, depending on your field, as new cases may take years to complete and be published.
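
A sample dataset can be checked for completeness and consistency with a short script. The sketch below is a minimal example, assuming the sample arrives as a list of dictionaries with hypothetical field names (`case_name`, `docket_number`, and so on) and dates stored as `YYYY-MM-DD` strings; adjust both to match the vendor's actual format.

```python
from collections import Counter
from datetime import datetime

# Hypothetical required fields; replace with the vendor's actual column names.
REQUIRED_FIELDS = ["case_name", "docket_number", "court", "decision_date", "jurisdiction"]


def completeness(records, fields=REQUIRED_FIELDS):
    """Return the share of records with a non-empty value for each field."""
    total = len(records)
    if total == 0:
        return {field: 0.0 for field in fields}
    return {
        field: sum(1 for r in records if str(r.get(field) or "").strip()) / total
        for field in fields
    }


def consistency_issues(records, date_format="%Y-%m-%d"):
    """List simple problems: unparseable dates and repeated docket numbers."""
    issues = []
    for i, r in enumerate(records):
        raw = r.get("decision_date")
        if raw:
            try:
                datetime.strptime(raw, date_format)
            except ValueError:
                issues.append(f"record {i}: bad decision_date {raw!r}")
    counts = Counter(r.get("docket_number") for r in records if r.get("docket_number"))
    for docket, n in counts.items():
        if n > 1:
            issues.append(f"docket_number {docket!r} appears {n} times")
    return issues


# Made-up sample rows for demonstration only.
sample = [
    {"case_name": "A v. B", "docket_number": "1", "court": "X",
     "decision_date": "2001-02-03", "jurisdiction": "J"},
    {"case_name": "C v. D", "docket_number": "1", "court": "X",
     "decision_date": "03/02/2001", "jurisdiction": ""},
]

print(completeness(sample))
print(consistency_issues(sample))
```

A repeated docket number is not always an error, since different courts may reuse numbers, so treat the output as a list of items to review rather than a verdict. Relevance still has to be judged by hand against your own use case.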

Interesting Case Studies and Blogs to Look Into

WIPO – World Intellectual Property Organization
Caselaw Access Project

Tangible Examples of Impact

“Our viewership has increased by over 240% since 2007 when we first started putting art objects online. The Walters’ website has received almost one million unique visits this year with the works of art site contributing 24% of that viewership,” said Manager of Web and Social Media Dylan Kinnett. “We hope that this percentage will continue to increase as new users share images on social networks, tag objects and curate their own exhibitions.”

Medievalists: The Walters Art Museum Removes Copyright Restrictions from more than 10,000 Images

Relevant datasets

Google Dataset Search

by Google

Google Dataset Search provides quality, continuously updated data of all kinds for researchers, data analysts, journalists, and the general public. It aims to enable the free and open discovery of all kinds of data and metadata in the world.

The platform also offers a Dataset Developer Page to help people add structured data to their datasets or to resolve any other problems.


Premonition Litigation Database

by Premonition

Premonition Litigation Database standardizes civil court documents from across the US. Users can search for data by judge, keyword, etc.


PatentSight Data & Analysis

by PatentSight via LexisNexis

PatentSight Data & Analysis tracks patent applications and news, provides risk assessments, and performs analyses of disruptive technologies.


Similar Data Providers

  • The Arabesque Group
    Established in 2013, the Arabesque Group is a leading global financial technology company that combines AI with environmental, social and governance (ESG) data to assess the performance and sustainability of corporations worldwide. In addition to its Asset Management consultation service, the group offers Arabesque S-Ray GmbH and Arabesque AI Ltd. datasets.
  • Black Box Intelligence Consumer Intelligence
    Black Box Intelligence Consumer Intelligence is designed to provide detailed analysis on individual competitor sales and performance data.
  • Home by Vendigi
    Home by Vendigi provides audience data on home buyers, remodelers, and sellers. Its data comes from first-party sources such as top multiple listing systems (MLSs) and major brokers like RE/MAX, Coldwell Banker, Century 21, and Sotheby's. Users of Vendigi's Home data range from home and garden retailers to insurance institutions to telecom companies.