Legal and IP Data

What Is Legal and IP Data?

Legal and IP data is the collection of data and metadata about law and intellectual property (IP), two closely related subjects. Legal data includes information about cases, judges, jurisdictions, and so on. IP data includes all information about creative works.

Where Does Legal and IP Data Come From?

This data is, at its core, legal and publicly available. Creative works that are not submitted to governmental bodies as patents are still subject to legal copyright protections.

Legal data comes from public case records that, until recently, were sometimes difficult to access because they were held in law books. Laws and cases are increasingly being recorded digitally instead of physically, and many projects are digitizing case texts and making them accessible to anyone with an internet connection.

Additional legal data may include interviews with lawyers and judges. 

What Types of Columns/Attributes Should I Expect When Working with This Data?

Common attributes of legal data include metadata such as case name, docket number, court, decision date, decision, and jurisdiction. The text of the cases themselves is the actual data. Although clerks and volunteers work to digitize this data, its sheer volume makes accuracy a problem for public-access legal databases such as the Caselaw Access Project's.

Legal data attributes also overlap with other industry-specific data. A lawyer specializing in banking law requires banking and financial industry data, for example.

Finally, IP data attributes include trademark, media type, owner name, and jurisdiction.
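As a rough illustration of the attributes listed above, the sketch below models one legal record and one IP record as Python dataclasses. The class names, field names, and sample values are assumptions made for this example and do not reflect any particular vendor's schema.

```python
from dataclasses import dataclass, asdict
from datetime import date
from typing import Optional


@dataclass
class CaseRecord:
    """One court case: the metadata attributes plus the case text."""
    case_name: str
    docket_number: str
    court: str
    jurisdiction: str
    decision_date: Optional[date]  # None when the date was not captured
    decision: str                  # illustrative value, e.g. "affirmed"
    text: str                      # full text of the case (the actual data)


@dataclass
class IPRecord:
    """One IP entry with the attributes named in the text."""
    trademark: str
    media_type: str
    owner_name: str
    jurisdiction: str


# Made-up sample values, not a real case or a real trademark.
example_case = CaseRecord(
    case_name="Doe v. Roe",
    docket_number="19-001",
    court="Example District Court",
    jurisdiction="Example State",
    decision_date=date(2019, 3, 4),
    decision="affirmed",
    text="Full text of the opinion goes here.",
)

example_ip = IPRecord(
    trademark="ExampleMark",
    media_type="image",
    owner_name="Example Owner Ltd.",
    jurisdiction="Example State",
)

# Separate the metadata columns from the case text.
metadata = {k: v for k, v in asdict(example_case).items() if k != "text"}
```

Keeping the case text apart from the metadata reflects the distinction drawn above: the metadata describes the case, while the text is the data itself.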

What Is This Data Used For?

This data is primarily aggregated into databases for reference purposes. Developers have also made digitization and search far easier by adding translation, image search, speech-to-text, and other features.

That is not all, though: with big data come artificial intelligence programs. For legal data, AI programs can analyze judges’ decisions and predict their rulings, suggesting approaches and arguments that lawyers may find successful.

Law firms may also use legal data for competitor and market analysis, for example by comparing their rates and compensation packages to those of other firms in their field and area.

Meanwhile, advances in artificial intelligence signal a revolution in IP law. AI has begun to create: to paint, to write, to compose music. In response, questions about how to assign intellectual property rights have arisen all over the world.

How Should I Test the Quality of Legal and IP Data?

Humans have already reviewed most of this data countless times before making it available online, so there is little quality control needed for the data itself. Problems may arise when the data is digitized and collected into larger databases, as mentioned above regarding the public-access Caselaw Access Project. However, mistakes in public databases are easily remedied.

Data collected and used by vendors that provide additional analysis and services is more difficult to test for quality. In this case, consider the vendor’s reputation and request a sample dataset to check for completeness, consistency, and relevancy. Timeliness or frequency of update is less important for this category, depending on your field, as new cases may take years to complete and be published. 
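The completeness and consistency checks on a vendor's sample dataset could look something like the sketch below. It assumes the sample arrives as a list of dictionaries with ISO-formatted date strings; the field names, function names, and sample rows are hypothetical. Relevancy is left out because it depends on your own use case.

```python
from datetime import date

# Assumed required metadata fields; adjust to the vendor's actual columns.
REQUIRED_FIELDS = ("case_name", "docket_number", "court",
                   "decision_date", "jurisdiction")


def completeness(records, fields=REQUIRED_FIELDS):
    """Return, per field, the share of records with a non-empty value."""
    total = len(records)
    if total == 0:
        return {f: 0.0 for f in fields}
    return {
        f: sum(1 for r in records if r.get(f) not in (None, "")) / total
        for f in fields
    }


def consistency_issues(records):
    """List simple problems: unparseable or future dates, repeated dockets."""
    issues = []
    seen = set()
    for i, r in enumerate(records):
        raw = r.get("decision_date")
        if raw:
            try:
                if date.fromisoformat(raw) > date.today():
                    issues.append((i, "decision_date is in the future"))
            except ValueError:
                issues.append((i, "decision_date is not a valid ISO date"))
        key = (r.get("court"), r.get("docket_number"))
        if all(key):
            if key in seen:
                issues.append((i, "duplicate court/docket_number"))
            seen.add(key)
    return issues


# Made-up sample rows for illustration only.
sample = [
    {"case_name": "Doe v. Roe", "docket_number": "19-001",
     "court": "Example District Court", "decision_date": "2019-03-04",
     "jurisdiction": "Example State"},
    {"case_name": "Doe v. Roe", "docket_number": "19-001",
     "court": "Example District Court", "decision_date": "03/04/2019",
     "jurisdiction": ""},
]

scores = completeness(sample)
problems = consistency_issues(sample)
```

In this sample, the second row would be flagged for both its date format and its repeated docket number, and the jurisdiction field would score as half complete.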

Interesting Case Studies and Blogs to Look Into

General Information:

WIPO – World Intellectual Property Organization

Creative Commons: When we share, everyone wins

Caselaw Access Project

Predictice: Legal research and analysis (La recherche et l’analyse juridique)

Lexis® Legal Advantage

Case Studies:


Decisia for the Washington Public Employment Relations Commission

European IP Helpdesk Case Study: Freedom to Operate

Tangible Examples of Impact

“Data/analytics and technology are no longer elective courses; legal providers must adopt, embrace, and utilize them to service clients in the digital age. Data and technology are essential components of internal and client-facing delivery.”

Forbes: Why is Law So Slow to Use Data?

AI has started to paint, write news articles, write poems, and compose music. This has led to questions about intellectual property rights.

WIPO: Artificial intelligence and copyright

“Our viewership has increased by over 240% since 2007 when we first started putting art objects online. The Walters’ website has received almost one million unique visits this year with the works of art site contributing 24% of that viewership,” said Manager of Web and Social Media Dylan Kinnett. “We hope that this percentage will continue to increase as new users share images on social networks, tag objects and curate their own exhibitions.”

Medievalists: The Walters Art Museum Removes Copyright Restrictions from more than 10,000 Images

Relevant datasets

Business News Americas Legal Solutions

Business News Americas Legal Solutions helps identify projects that need financing or that are delayed and need legal advice.

GBG KnowYourPeople


GBG KnowYourPeople provides fast background checks on your employees to keep your organization—including your current employees—safe. KnowYourPeople checks UK Right to Work documents, confirms employee identities, does more than one criminal background check, does a financial background check, and more. 

GBG IDscan


GBG IDscan is part proprietary legal document database and part AI color wave tech that can confirm whether a scanned document is likely genuine. Retail companies, police forces, and banks all trust GBG Group’s IDscan.

Similar Data Providers

  • The Arabesque Group
    5 (1)
    Reviews ()
    Data sets (4)
    Established in 2013, the Arabesque Group is a leading global financial technology company that combines AI with environmental, social and governance (ESG) data to assess the performance and sustainability of corporations worldwide. In addition to their Asset Management consultation service, the group offers Arabesque S-Ray GmbH and Arabesque AI Ltd. datasets.
  • Black Box Intelligence Consumer Intelligence
    5 (1)
    Reviews ()
    Data sets (0)
    Black Box Intelligence Consumer Intelligence is designed to provide detailed analysis on individual competitor sales and performance data.
  • Home by Vendigi
    4.3 (3)
    Reviews (1)
    Data sets (0)
    Home by Vendigi provides audience data on home buyers, remodelers, and sellers. Their data comes from first-party sources such as top multiple listing systems (MLSs) and major brokers like RE/MAX, Coldwell Banker, Century 21, and Sotheby's. Users of Vendigi's Home data range from home and garden retailers to insurance institutions to telecom companies.