Update August 26, 2023: A visual summary is now available in a separate post, you can read it by clicking here.
Abstract. This working article presents RAMP (RoboNet Artificial Media Protocol), a novel Internet protocol specifically designed to provide normative instrumentation for the Internet's AI era at a very low technical and political cost. The protocol defines an operational distinction between artificially generated media and human-made content by requiring that certain AI systems, and AI-generated content in particular, no longer be served over the HTTP protocol, adopting the newly proposed RAMP and its specifications as the standard instead. RAMP adoption offers a technical and legal framework for the delivery of critical AI data, software and systems, while also introducing a reliable industry-wide standard for AI provenance and traffic. RAMP aims to enable a future where individuals have enhanced agency over how they engage with AI systems, streamlining a more responsible relationship between AI service providers and their users by establishing timely technical conditions for a politically neutral and universal AI provenance standard protocol.
Contents
1. Introduction
1.1. Why HTTP/3
1.2. RAMP Instrumental Capabilities
2. AI-generated content: A perspective
3. The non-RoboNet approach
4. RoboNet Artificial Media Protocol
5. Compliance Capabilities
6. Regulation and Standards
7. Challenges and Limitations
8. RAMP’s vision for the near future
9. Conclusion
Annex 1 - RAMP and GenAI text challenges
Annex 2 - RAMP and Content Provenance Techniques
1. Introduction
RAMP is an experimental Internet protocol built “on top” of the HTTP/3 IETF standard, introduced specifically to serve most AI-generated content and AI systems that impersonate human behaviour (a somewhat more encompassing term than “generative agents”), with a focus on transparency and algorithmic accountability. The protocol responds to a challenge of our times: it aims to update the Internet itself for its AI era.
The year 2023 marks the last days of a World Wide Web (www) in which the majority of content is still mostly human-made, while its synthetic counterparts roll in and leave a challenging footprint on its cultural and intellectual integrity, authenticity and operational accountability. RAMP explores a novel approach to the technical conditions that enable most of today’s Internet problems with AI systems and AI content provenance conflicts, and it offers a way to address them before the www becomes a place where the distinction between human and synthetically made content is blurred forever.
RAMP introduces an opportunity to address the reactive policy-making imposed by the asynchronous feedback loop between the pace of AI innovation and the political-regulatory response to it. By being an Internet protocol designed specifically for AI as a technology, RAMP introduces technical conditions that allow uniform regulatory responses to emerge, so that modern society may engineer optimal strategies to welcome the worldwide deployment of AI systems of all kinds.
An unprecedented approach to AI regulation, RAMP targets AI services through their delivery method, independently of their capabilities, features or country of origin. Those aspects should continue to be addressed at subsequent policymaking stages, as they are today, but then relying on the standardized provenance and compliance mechanisms provided by RAMP-compliant vendors.
1.1. Why HTTP/3
Internet protocols are definitely not the common approach to AI regulation, but traditional initiatives don’t offer the benefit of an independent AI classification scheme that is the same for all countries, industries and regulators. RAMP aims to provide a future-proof dual-use instrument by combining technical and regulatory instrumentation across the AI service delivery stack, and to best prove its capabilities it takes a highly unusual approach: it is essentially an HTTP/3 clone, so that both initiatives can help each other succeed and accelerate adoption by Internet industry stakeholders.
Standards such as HTTP don’t have many reasons to get cloned, but as RAMP aims to be as frictionless as possible for Internet users, using HTTP/3’s efficiency and up-to-date security practices as a boilerplate should offer everything RAMP requires to grant regulators its instrumental advantages as soon as possible.
Standardized by the IETF as RFC 9114 in June 2022, the latest HTTP update isn’t really new: HTTP/3 has been in the making since 2016, and while websites are adopting it slowly, by 2023 the browsers used by more than 70% of all Internet users already support its features, and by the end of 2023 Q2 around 30% of all human HTTP traffic already uses it, as seen in Figure 4.
HTTP/3 was six years in the making. By forking the HTTP/3 standards in its initial phase, RAMP fast-tracks its own adoption and can count on the expertise of thousands of professionals who already have HTTP/3 experience.
But most of all, just as HTTP/3 users don’t type http3:// into their browsers and apps when consuming HTTP/3 services, RAMP users won’t have to type ramp:// or ever notice the difference from a regular HTTP service unless someone tells them. By operating at the protocol level, RAMP presents itself at the Operating System (OS) level, which means every app, website and platform OS can be transparently aware of RAMP capabilities, features and standard, with zero impact on the end-user experience.
1.2. RAMP Instrumental Capabilities
RAMP stacks AI service type classification and provenance compliance instruments at the AI delivery phase, forging a mechanism that allows the enforcement of legal agreements specifically crafted for AI: technically, regulators and authorities may then act on the AI products themselves, not on the AI companies nor on their other non-AI services. This particular characteristic means regulators may technically enforce AI constraints in either direction, HTTP versus RAMP or vice versa, independently of one another.
Technically, RAMP’s core practical function isn’t to differ from how HTTP works. It is instead to deliver a common language that can catalyze synthetic content regulations, AI standards development, and friendlier collaboration among AI stakeholders, as RAMP takes shape as a common instrument that isolates the technical language used to discuss AI’s delivery method, not its capabilities, at a moment when AI professionals from all walks of life are asking for an optimal political response to AI risks:
“Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.” - 2023, Center for AI Safety.
As RAMP’s overarching capabilities encompass the entirety of the AI service delivery stack, it simplifies issues at the core of the international misalignment on shared AI regulations by addressing the lack of common ground for targeting the misuse of today’s and tomorrow’s AI technologies. RAMP enables all policymakers to target AI without impacting the HTTP Internet itself, a remarkably clean instrument for AI regulators that can reconcile efforts across geostrategic timescales.
http://info.cern.ch/hypertext/WWW/TheProject.html
Above: The first web page ever created used http:// and is still online to this day
Just as the first web page, created by Sir Tim Berners-Lee on top of the HTTP protocol, introduced the existence of every online service we have seen to date, our collective knowledge of what Internet protocols are capable of should be the proper framing to assess future RAMP capabilities, which right now are mostly about standardized consensus and practices, but nonetheless a foundation for many other future AI initiatives.
By working together with entities focused on modern AI best practices, such as those developing AI media metadata standards like the International Press Telecommunications Council (IPTC) and the Coalition for Content Provenance and Authenticity (C2PA), RAMP-enabled services can distribute synthetic media and AI systems as a standardized medium for the AI industry, not as an intermediary, but as AI’s own digital delivery instrument, crafted specifically for this end from its very start.
But above any implied end-user benefits or technical capabilities, which are indeed worth exploring at length, RAMP’s role is to become an instrumental catalyst for the enforcement of regulations and standards for AI services. Country regulators, policymakers and other Standards Developing Organisations (SDOs) may finally share a common technical object that catalyzes initiatives aimed at the many known and yet unknown potential hazards of AI technologies, while also insulating today’s HTTP Internet from massive waves of ad hoc AI-oriented regulatory interventions, which are not only difficult to craft but also difficult to harmonize in international fora. This is a must-have capability on the wish list of many AI experts and part of RAMP’s pragmatic essence.
2. AI-generated content: A perspective
Given the 2023 developments of GenAI systems such as Stable Diffusion and ChatGPT, a remarkable share of our online systems is now exposed to the misuse of similar and new AI technologies, empowering those who use such services with malicious intent with capabilities yet to be properly understood. RAMP’s sharp focus on synthetic media and AI services enables it to interface with such issues before they impact institutions, organizations and other ecosystem dynamics, giving RAMP a timely set of advantages over other regulatory approaches.
The following lists sample two key implications of AI content and systems that arise at distinct, non-chronological moments of the AI interaction lifecycle: the type of issues and how they are delivered:
The type of issues enabled by AI-generated media:
content that lacks watermarking or industry wide authorship consensus;
training data IP, copyright, fair use and licensing matters;
collection methods, source biases and information trustfulness;
alignment, control and misuse;
The how issues, as AI-generated content:
can be used to attack, corrupt and misuse legacy Internet services;
can impersonate biometrics such as voice or face among other credentials;
can automate misuse of data, techniques and credentials at scale;
can influence elections, media and social dynamics;
RAMP aims to be both technologically and semantically inserted into the digital services stack as a proper delivery mechanism that shapes how AI services and content are exposed on the www alongside “legacy” HTTP, providing a clear legal instrument that allows targeted mitigation responses for AI as a technology.
3. The non-RoboNet approach
Recent American, Canadian and European developments in AI policymaking tackle major society-wide AI harms by using risk analysis principles as a holistic instrument, a valid approach, but treating AI traffic the same as any other HTTP traffic imposes heavy societal-scale burdens on the Internet. Even with ever-improving regulations, governments and citizens still have to rely on private sector companies and third parties to classify and moderate AI systems and content, demanding new sub-systems, further procedures and crippling institutional policies to tackle AI misuse at subsequent key moments of digital service lifecycles. This is a particularly diffuse approach in which effective common governance mechanisms at global scale are virtually non-existent and are even going through a moment of infighting for international influence in a standard-setting war zone, where private sector and state-led AI governance initiatives competing for “AI leadership” face non-state-led initiatives (most notably the OECD) and other popular SDOs such as ISO and the IEEE.
Laws, regulations and SMEs of all fields ask content providers and platforms on the Internet to label synthetic media, identify chatbots, address disinformation posted and promoted by automated systems, create content filters for influence-as-a-service entities, address manipulative algorithmic systems, and curb the use of deepfakes and AI-enabled disinformation operations. This ever-growing list imposes costs on the entire ICT (Information and Communications Technology) stack, with no industry-wide established QoS standards, largely because the way societies react to this problem list is as new as the list itself, in a domain where actual experts are a scarce resource.
And while common regulatory conformity mechanisms such as certification marks may offer economic incentives for voluntary adherence to policies between economic allies, bad actors share no domestic or political benefit for any form of compliance, particularly in the defense industry, where the threats of cyberwarfare and digital espionage find even fewer reasons to play fair in the Internet’s AI era. As an instrument as apolitical as HTTP itself, the RAMP protocol is where compliance should, in the medium term, isolate non-compliant AI through simple peer dynamics, a scenario that today’s technical conditions are simply not built to deliver on their own.
As such, RAMP understands an AI protocol to have a part to play in the Internet’s ecosystem of devices and networks, so it institutes 5 Core Principles:
RAMP serves Mixed and 100% AI generated content only;
RAMP serves all AI systems and services that provide human like behavior;
RAMP service modelling should promote academic data cooperatives and trusts;
RAMP content served by services using other protocols should be discouraged;
RAMP compliance should have legal and/or binding regulatory incentives;
These principles are aspirational placeholders for industry-wide standards and are not vital for RAMP deployment, or even for its voluntary compliance/certification by industry vendors, but they aim to offer optimal conditions for the use of RAMP’s many benefits and mechanisms alongside other protocols.
4. RoboNet Artificial Media Protocol
As specified today, all RAMP requests implicitly have a protocol version of “5.0”, while RAMP responses have four distinct versioning flavors: "5.10" for text/html content; "5.11" for mixed AI content, internally called RAMP1; "5.12" for RAMP2, that is, 100% synthetic content and AI automated systems; and “5.0” for everything else, as defined in section 4.3.2 of the RAMP experimental RFC document:
4.3.2. Response Pseudo-Header Fields
The following pseudo-header fields are defined for responses:
":status": Carries the RAMP status code; see Section 15 of [HTTP].
This pseudo-header field MUST be included in all responses;
otherwise, the response is malformed (see Section 4.1.2).
":version": Contains major and the minor version number. RAMP
responses implicitly have a protocol version of:
"5.10" for text/html content,
"5.11" for mixed AI content,
"5.12" for 100% synthetic content and AI automated systems
"5.0" is assumed for anything else as default.
This pseudo-header field MUST NOT be empty for "http" or "https"
URIs; A version component MUST include at least the value of
"5.0" as default.
Unlike classical approaches, the minor number is not meant to imply extra protocol capabilities. It is meant to offer straightforward content classification independently of status response numbers, so that OS-level AI classification may be consumed by RAMP implementations of any complexity, no matter how servers, proxies, gateways or applications react to RAMP requests.
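As a minimal sketch of how an OS-level or application-level consumer might use this scheme, the snippet below maps the ":version" pseudo-header values quoted above to content classes. The version strings follow the experimental text above; the function and enum names are illustrative assumptions.

```python
from enum import Enum

class RampClass(Enum):
    """Content classes implied by the RAMP ":version" pseudo-header."""
    DEFAULT = "5.0"           # everything else
    TEXT_HTML = "5.10"        # text/html content
    RAMP1_MIXED = "5.11"      # mixed AI content (RAMP1)
    RAMP2_SYNTHETIC = "5.12"  # 100% synthetic content and AI automated systems (RAMP2)

def classify_response(pseudo_headers: dict) -> RampClass:
    """Return the content class of a RAMP response, independently of its :status."""
    version = pseudo_headers.get(":version", "5.0")
    try:
        return RampClass(version)
    except ValueError:
        # Unknown minor versions fall back to the default class.
        return RampClass.DEFAULT

# Example: a hypothetical RAMP2 response carrying a fully synthetic asset.
print(classify_response({":status": "200", ":version": "5.12"}))  # RampClass.RAMP2_SYNTHETIC
```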
Further investigation is required into having another RAMP version identifier, such as "5.13", exclusively for text generated by Large Language Model (LLM) applications, in order to better understand this new relationship between provenance and privacy; in theory, identifying LLM-based application traffic may offer opportunities for a broader ecosystem of apps that inspect or interact with this type of media. More on this specific use case can be found in Annex 1 of this article. RAMP positions itself at the AI service delivery stack as an application-agnostic instrument that, in time, should also become fine-tuned for the most popular AI product types.
As RAMP is an HTTP/3 fork, this article doesn’t further elaborate on its QUIC capabilities, how handshakes take place on first visits and in apps, etc. Readers interested in RAMP technical specifications may simply refer to HTTP/3 readings. Readers interested in following how the RAMP RFC evolves over time may compare the RAMP and HTTP/3 specifications side by side using the IETF iddiff tool.
5. Compliance Capabilities
AI technologies already work over HTTP today, and Internet companies are very able to deliver many of the propositions of this article using known tools without resorting to any new Internet protocol, especially one that doesn’t extend or modify current HTTP features. But at present, RAMP’s most interesting capabilities are not directly tied to its technical distinctions; instead, they are a side effect of the intersection between RAMP and every other protocol, the very foundation of where and when the enforcement of AI regulations may exist at specific moments of the AI system lifecycle. More specifically, RAMP allows both AI service providers and AI end users to interact with RAMP-compliant AI services as:
a) Defined by RAMP standard specifications;
b) Defined by sovereign legal entities;
c) Defined at corporate or parental policies; and
d) Defined as end user settings; in this order.
The reverse order of this list is how RAMP enables policymaking bottlenecks to exist at every stage of these interactions, as shown in Figure 10.
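A minimal sketch of the precedence just described: policies are evaluated from (a) to (d), with each stage only able to further restrict the one above it. The four-layer ordering comes from the list above; the set-based representation and names are assumptions of this sketch.

```python
# Each layer may restrict which RAMP content classes remain allowed.
# Evaluation order follows the article: spec -> legal -> corporate/parental -> end user.
RAMP_SPEC_DEFAULT = {"5.10", "5.11", "5.12"}   # (a) classes defined by the RAMP specification

def effective_policy(legal: set, corporate: set, user: set) -> set:
    """Intersect the layers so each stage can only narrow what the previous one allows."""
    allowed = RAMP_SPEC_DEFAULT
    for layer in (legal, corporate, user):      # (b), (c), (d) in this order
        allowed = allowed & layer
    return allowed

# Example: the jurisdiction allows everything, a parental profile blocks 100% synthetic
# content ("5.12"), and the user opted in to mixed content only.
print(effective_policy(
    legal={"5.10", "5.11", "5.12"},
    corporate={"5.10", "5.11"},
    user={"5.11"},
))  # {'5.11'}
```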
Over time, RAMP’s independence from other protocols should expand what the many stakeholders can build with the protocol’s characteristics and behavior, allowing an entire set of new solutions to be built on top of its capabilities at both ends of RAMP’s interactions with other protocols (e.g. RAMP vs HTTP or HTTP vs RAMP). This singular characteristic is one of the main drivers for the existence of a specific Internet protocol that treats AI as a technology.
As the understanding of RAMP ecosystem behavior evolves over time, optimal mechanisms specifically oriented at the security and reliability of AI data and systems should have RAMP’s delivery mechanisms to grow upon, as RAMP acts as a vehicle for provenance and identification of all AI services.
But where RAMP unexpectedly excels is that, by being a distinct protocol for AI systems with OS-level compliance capabilities, even offline AI systems can abide by device protocol policies. This safeguards that offline devices still abide by corporate, parental and user-chosen settings regarding AI content, a paramount feature no version of the HTTP protocol was ever designed to offer, as RAMP can work independently of a device’s Airplane mode for AI apps that operate offline.
Finally, there are likely even more complex and possibly unforeseen scenarios involving the misuse of AI in offline modes, particularly given that today’s AI systems and apps produce network traffic indistinguishable from that of any other type of app. By offering this enforceable* and distinct protocol, phone and computer OS settings can expose RAMP’s extra security layer to all apps. Without exposing user privacy or relying on online services, this may in the future even enable authorities to use RAMP to define geofencing policies that completely disable the use of AI systems in specific geographic areas, such as schools and prisons, even for offline sideloaded AI apps, a capability no technology or regulation currently available can offer, much less enforce. RAMP introduces the technical conditions that enable such capabilities with virtually zero technological complexity for vendors, streamlining a novel human-centered security approach that, while still imperfect, refines a method and common language for discussing these scenarios without impacting Internet users on the “legacy” HTTP www.
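A sketch, under stated assumptions, of how an OS could honour such a geofencing policy even for offline apps: the device ships or caches a list of restricted zones and refuses to hand RAMP2 capabilities to apps while inside one. The zone data, radius and function names are all hypothetical.

```python
import math

# Hypothetical restricted zones: (latitude, longitude, radius in metres), e.g. a school.
RESTRICTED_ZONES = [(48.8566, 2.3522, 300.0)]

def _distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two coordinates, in metres (haversine formula)."""
    r = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = math.radians(lat2 - lat1), math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def ramp2_allowed_here(lat: float, lon: float) -> bool:
    """Deny RAMP2 (100% synthetic / AI system) capabilities inside any restricted zone.
    Runs entirely on-device, so it also applies to offline or sideloaded AI apps."""
    return all(_distance_m(lat, lon, zlat, zlon) > radius
               for zlat, zlon, radius in RESTRICTED_ZONES)

print(ramp2_allowed_here(48.8570, 2.3525))  # False: inside the sample zone
print(ramp2_allowed_here(48.9000, 2.4000))  # True: well outside it
```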
*Enforcing compliance won’t stop malicious users from compiling AI systems that bypass or disguise RAMP traffic as HTTP or similar, or even more advanced threats specifically engineered to bypass compliance and restrictions. But having RAMP’s protocol focus on the delivery of all AI content and AI systems offers a unique insight instrument for AI traffic behavioral fingerprinting and for training algorithms and firewalls (or similar tools) that can enforce stronger assurances against the misuse of AI technologies, even under the most challenging scenarios, in an automation-friendly approach. No other publicly available regulatory instrument can match RAMP-specific insights.
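As a purely illustrative sketch of this kind of coarse traffic insight, the function below tallies flows by their negotiated application protocol, assuming a hypothetical "ramp" ALPN-style token distinct from HTTP/3's "h3". The token, the flow records and their fields are assumptions, not part of any specification.

```python
from collections import Counter

def summarize_flows(flows):
    """Count flows per negotiated protocol token, e.g. as reported by a gateway or firewall.
    A host that normally speaks RAMP suddenly emitting only HTTP flows could be flagged
    for review as possible RAMP-evasion behaviour."""
    return Counter(flow["alpn"] for flow in flows)

# Hypothetical flow log entries as a gateway might expose them.
flows = [
    {"src": "10.0.0.5", "alpn": "h3"},
    {"src": "10.0.0.5", "alpn": "ramp"},   # assumed RAMP token, not a registered ALPN ID
    {"src": "10.0.0.9", "alpn": "ramp"},
]
print(summarize_flows(flows))  # Counter({'ramp': 2, 'h3': 1})
```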
6. Regulation and Standards
RAMP aims to enable a technical and regulatory framework in which AI service traffic may no longer hide its tracks by blending into common HTTP traffic noise when interacting with the www Internet. As the protocol handshakes with other systems as what it is, not as everything else, RAMP isolates AI from every other type of digital event, which renders RAMP a powerful governance mechanism for all AI regulators: on top of all its features, RAMP offers a common “AI off-switch” that is apolitical in nature.
Not by accident, RAMP should be able to address the common risks that arrive with the misuse of new AI systems built with no industry-wide standards. Having no common ground for targeting AI technologies, regulators today impose AI content classification and detection onto service providers’ technological choices, budgets and incentives for compliance. RAMP eliminates this step and the usual implications of its misuse, such as biases and behavioral doctoring, as AI provenance is no longer a matter of corporate detection but a server-side mechanism that can be enforced, audited and classified by domestic and international regulations and laws.
As RAMP is a protocol that can also deliver classification of human, mixed and synthetic media with standardized provenance, Internet Service Providers may also benefit from two novel revenue streams that deliver distinct and unique experiences for each of these business propositions, thanks to how RAMP standardizes provenance for these types of products. Over time, reframing the Internet itself with an AI-capable business architecture should also forge the conditions for a less toxic Internet driven by fewer poor incentives, as even bot accounts should, at the very least, eventually become somewhat more compliant.
As an Internet protocol, RAMP is also less exposed to domestic political frictions: since it is not a law, its definitions can remain technical in nature, removing the burden of asking Social Media Platforms and Internet Service Providers to take political stands on the computational and artificial advantages, or political moods, of their home jurisdictions when addressing AI misuse. Laws may or may not enforce RAMP certification, and Internet giants may or may not self-regulate by adopting RAMP standards alongside HTTP/3 standards; these are all desirable political responses from regulators, SDOs and other industry stakeholders, not requirements for RAMP to be deployed or adopted.
Also, this may soon change, but AI regulations, even the most recent ones such as the EU AI Act, are protocol agnostic, so RAMP inherits all existing legal AI provisions as fluidly as possible. Unlike a law, RAMP doesn’t need to be sanctioned by Russia to work in Russian browsers and apps; it is an update to the Internet’s own structure. If countries decide to keep regulating AI the same way they already do, nothing needs to change, but if non-compliant states or entities from those states want to offer their services among compliant entities, RAMP acts as a real-life foundation for the enforcement of such policies.
RAMP aims to promote frictionless technical and legal characterizations for the delivery of AI systems and AI-generated media, but by doing so at the Internet protocol level, RAMP remains as apolitical as HTTP itself, an instrument with valuable politico-regulatory benefits that should be robust enough to endure AI consumption challenges over many decades to come, as HTTP itself has been.
Lastly, RAMP allows lawmakers and regulators to work on constitutional guidelines covering the whole AI service delivery stack by targeting RAMP alone and how it engages with other Internet protocols or devices in their jurisdictions. RAMP empowers global AI policymakers to legislate specifically on AI by offering this common starting ground, leaving AI interactions with the www Internet as just one of many possible AI delivery and consumption methods that should eventually emerge. It paves the bedrock for uncommon AI platforms such as robots and UAVs to have distinct AI network capabilities and norms, a scenario in which the exploitation of technical and regulatory conditions is no longer a structural imposition on all Internet users: real-world first steps towards the separation of military/defense AI and civilian AI, a dense topic for which RAMP provides a solution as apolitical as it should be, addressing the regulatory harmonization of today’s AI geopolitical aspirations towards the compliance of AI-native future networks.
7. Challenges and Limitations
Not even HTTP/3 itself is that popular yet: not many servers and websites use it at the moment, and the use of UDP instead of TCP allows routers, switches, firewalls, data centers, networks and many other corporate environments to simply block UDP over ports 443 and 80 altogether, which by default makes requests downgrade to HTTP/2 or even HTTP/1.1, ultimately negating HTTP/3 benefits such as speed and security. Tying RAMP to HTTP/3 adoption may accelerate industry compatibility and adoption for both, but that assumes a good amount of coordinated effort, which, again, may benefit everyone. The gist of this condition is that across satellites, networks and devices, the Internet is a live, wild ecosystem of interacting networks (a network of many networks, aka “ManyNets”) whose policies today play it safe by consuming mostly HTTP/1.1/2 services. HTTP/3 and RAMP are about the future: while still vastly backwards compatible, both can get “blocked” at various network interaction stages for non-specific reasons. For what it’s worth, HTTP/3 was created because HTTP/2 could do better, and RAMP follows the same reasoning, but it links uncommon “worlds” together as a strategy for future Internet architectures.
QUIC, the underlying technology empowering HTTP/3 (and by consequence RAMP as well), is much more secure for end users mainly because QUIC does not allow plain-text (read: insecure) communications to take place; everything is encrypted by default, and apparently even TLS proxy firewalls still have trouble inspecting HTTP/3 data to this day. This raises compliance issues (particularly in corporate networks) and may even lead bad actors to choose HTTP/3 and RAMP as an optimal delivery route for yet unknown threats. The upside is that everyone is safer; the downside is that, unfortunately, this includes bad actors.
RAMP also does not magically replace Intellectual Property rights discussions. RAMP is merely an encompassing asset that catalyzes AI regulatory instrumentation by providing a common delivery asset for matters of attribution, privacy and other legal dispositions such as ownership and licensing related to AI systems and AI media. It is still the regulators’ job to work on better practices that put such mechanisms to use, just as it is still the platforms’ job to clearly display RAMP provenance information. RAMP merely offers a common standard for these parties to work on.
RAMP asks Internet giant stakeholders not for a pause on AI developments, but for an opportunity to update the Internet itself for its AI era. RAMP allows a more consistent and encompassing approach that is human-centered by design, crafted for cooperation between AI providers of all countries and jurisdictions, a proactive decision that acts as a building block for initiatives looking towards the fair and safe common use of AI, a much-needed foundation where the Internet then has a chance to grow with a plan, in a particular apolitical setting where no country has ownership over the instrument itself for perverse or shortsighted misuse: a true shared policy instrument, not a remedy, aligned with the IETF’s own mission to promote apolitical standards that make the Internet “work” better.
RAMP updates the Internet’s own structure, as shown in Figures 15 and 16, so that users, companies and governments may enjoy optimal controls for AI as a technology, not as an imposition on the Internet.
RAMP enables a valuable set of foundational technical assets aimed at addressing today’s issues with how AI systems interact with Internet users in a prosocial approach, promoting a safer and more coherent online experience in the long term, as it protects Internet users from the geographically disconnected pace of AI regulation and from the lack of optimal incentives that Internet companies and services have budgeted for the misuse and abuse of their services. But those are its limits: countries and their regulatory bodies would still face unintended consequences and challenges for their actions as much as for their inaction. RAMP merely updates the dynamics; it doesn’t eliminate problems.
Also, while RAMP presents a promising solution for addressing the enforcement of AI regulation and provenance, it is crucial to recognize that as a novel approach, ongoing research and collaboration are essential to refine and enhance its capabilities. RAMP should be treated as a technical instrument, similar to how HTTP is treated, focusing on its functional aspects rather than engaging in political debates. This approach ensures a safe and apolitical deployment of RAMP, enabling its full potential to be realized in promoting fairness, transparency, and accountability for the delivery of AI systems. By engaging in continuous research and collaboration, RAMP can be further optimized and adapted to an ever evolving landscape of AI technologies and regulations, fostering an environment where the benefits of AI can be harnessed both responsibly and ethically by all.
Finally, RAMP empowers policymakers, regulators, scholars, and institutions to closely observe and analyze the behavior and evolution of the protocol itself. This unique perspective allows for a comprehensive understanding of how AI is collectively consumed worldwide no matter where or how. By embracing RAMP, these stakeholders may then monitor the protocol's characteristics over time and gain valuable insights into the responsible deployment and usage of AI. RAMP enables AI systems to operate and evolve at their own digital pace, while providing the necessary framework for policymakers to shape regulations and foster a safer and more accountable AI ecosystem.
8. RAMP’s vision for the near future
RAMP was conceived because I understand that over time most AI systems should evolve into intelligence streaming services, assuming AI as a Service (AIaaS) as their final commercial form from 2024 on. This is not to say commercial AI services won’t work offline (e.g. autonomous driving platforms), but a significant share of AI systems should become a hybrid of offline/online capabilities, and to me it is also easy to assume that today’s and tomorrow’s intelligence streaming services will be offered as subscriptions to end customers.
Even with the free tiers available for products like ChatGPT and similar, the tendency is that after the now-natural business model of operating at a loss settles for those who can afford (read: endure) the oligo-duopoly war period common to dotcoms, from autonomous systems to personal assistants, a monthly fee is how all digital systems have evolved to date, and an argument that AI will be commercially different lacks credibility. It is wise, though, to keep this assumption separate from the technical perspective.
RAMP positions the protocol concept at the most reasonable locus for AI as a technology, not as a digital traffic imposition among all other Internet traffic, which renders AI as something that can be delivered as a customised technology for end-user devices, allowing RAMP-compliant implementations, such as those inside industry OSs (today Android/iOS/Windows/Mac and Linux in most cases), to offer AI systems as an end-user feature, illustrated here as a toggle button with 3 states:
ON [akin to today’s mostly human content Internet]
Mixed [mixed/composite human/AI content] [IPTC 1]
RAMP also yields a proper military/defense channel for public-private partnerships (PPPs) and provides a proprietary environment for future AI systems of networks, a valuable operational model that can from the start offer a non-civilian environment for RAMP-compliant services, as the major part of RAMP complexity is handled by key field operators, not SMEs, unicorns or individuals. RAMP is also where systems classified or identified as High-Risk AI systems by regulators and policies can have a bounded environment to be used within reasonable assumptions of compliance among allies, fostering a framework that can apolitically evolve for all countries and governance models, a most valuable foundation for a world where AI use can become as common as operating systems are to any computer or phone, AI’s natural evolutionary pathway.
And as an Internet protocol itself, RAMP becomes an additional asset for addressing contemporary Internet perils such as those explored by actors who misuse Social Media for mass manipulation schemes, behavioral experimentation and the informational hazards caused by “traditional” and new disinformation “business” models, denying such services the operational conditions to continue evolving with AI technologies. RAMP denies current Internet malpractices the possibility of evolution, intentionally rendering most of today’s digital threats obsolete over time, something HTTP can’t do.
9. Conclusion
This article introduced RAMP as a dual-use technical-regulatory instrument through which the origins of AI content can be reliably traced, verified and audited, tied to its network delivery method, empowering Internet users and regulators to discern between AI-generated and human-created data and systems. By shifting the burden of AI content classification from the internal technologies of Internet Service Providers to a uniform global Internet Protocol standard, RAMP offers a novel approach to the enforcement of AI policymaking that can in fact deliver fairness, transparency and accountability, all while granting semantic mechanisms that can better protect human rights and freedom of expression online. Provision of the RAMP protocol asks for a modest synergy between AI providers, regulators and Standards Developing Organisations towards its final specification and deployment, but even so, it remains a political and technical effort orders of magnitude simpler than orchestrating common and meaningful international regulations for today’s and tomorrow’s AI challenges. RAMP offers the Internet a unique set of benefits for its AI era, one that can be crafted and delivered for a relatively small political and temporal effort, and one that imposes no cost and no impact on Internet users’ everyday lives.
Annex 1 - RAMP and GenAI text challenges
Disclaimer: RAMP is a protocol. It is meant to be a foundation upon which solutions can be built; RAMP is not a solution on its own. HTTP enabled streaming services and online gaming to exist, it was the foundation for the whole of e-commerce, and it now does it again by serving the entire Metaverse. Even after decades, HTTP is still the bedrock for pretty much everything we see online. RAMP is HTTP for the Internet’s AI era, so asking “what can RAMP do for us” may not be the right question. What I explore in this Annex is A response to one of the biggest challenges of Generative Artificial Intelligence, but as a protocol, RAMP’s limits lie in the hands of its users.
Disclaimer 2: This is an educational exercise that explores fictional scenarios to illustrate how RAMP protocol features can change the dynamics of the challenges imposed by the misuse/abuse of Generative Artificial Intelligence, particularly regarding plain-text generation. While the techniques explored in this Annex are based on real-world concepts, services, brands and technologies to best illustrate RAMP capabilities, real-world deployments demand much more intimacy with the themes superficially covered here; this is not a cake recipe.
Introduction
This exercise introduces a method for provenance validation of text generated by Language Services offering Large Language Models (LLMs) or similar Generative Artificial Intelligence (GenAI) technologies for Internet users.
The RAMP protocol implements methods for content provenance by making sure files and systems served by RAMP Servers have a standard “response code” associated with the metadata manifest data carried by these files or by the AI services they provide. Texts generated by Language Services have no file where we can register this information, and this is a challenge many nice folks are trying to address in order to somehow enforce text provenance policies with these services.
According to RAMP principles, Language Services fall into its RAMP2 category as “systems that generate human-like behavior”, and as such, messages coming from RAMP-compliant services should use “RAMP/5.12” as the version for their responses. This Annex, however, explores the “5.13” idea mentioned in the main RAMP article:
… further investigation on how having another RAMP version identifier such as "5.13" exclusively for the text generated by Large Language Models (LLMs) applications is required to better understand the relationship between provenance and privacy, but in theory identifying LLMs based applications traffic may offer opportunities for a broader ecosystem of apps that inspect or interact with this type of media. RAMP positions itself at the AI service delivery stack as an application agnostic requirement that in time should become fine tuned for compliance of most popular AI product types.
So to be clear, what RAMP does differently here is that provenance comes from RAMP-compliant services at the delivery stage, not the application stage, meaning it is not the apps’ or online services’ job to tell what is or isn’t text generated by firms offering this as a product. This is RAMP’s provenance feature in action, an apolitical and non-“captured” instrument that by principle can deliver optimal privacy conditions at all stages of this Annex’s propositions, as what is checked here are random strings of scrambled text from the original data, where neither the contents nor the language of the text being checked matters for the AI provenance attribution validation process.
The technical approach explored here is based on Apple’s controversial CSAM Detection technology, and readers of its technical summary [pdf] document will have a chance to see a real-world example of everything done here, and to understand the similarities and the overarching technical orchestration required to make it all work.
Apple’s approach was dependent not only on their own device ecosystem, but also on their iCloud service as a bridge to “detect” content while Apple devices sync with iCloud for backups, stacking capabilities onto existing services in order to avoid creating new “solutions/problems”. Both the target use of this technology and the way Apple sold this capability made the “feature” almost instantly go back to the drawing board, with Apple’s NeuralHash (Apple’s hashing function) ending up labeled as not ready for prime time, as something that, while efficient, was far from an ideal privacy-friendly solution.
Technologically speaking, no matter if the data is an image as for Apple, or a song as for the famous app Shazam, or a text as in this Annex, data fragments are somehow converted to a matrix containing hashed segments of said data. Here, we can take a single sentence or an entire book, take parts of it without respecting word boundaries, punctuation, language or meaning, and store tokens, or hashes, that represent each segment in a non-linear index or some other type of database, as sketched below.
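A minimal sketch of that segmentation-and-hashing idea, with illustrative parameters: overlapping character segments are hashed after normalization, ignoring word boundaries, punctuation and case, and stored in an unordered set. The segment length and step are assumptions of this sketch, not part of any standard.

```python
import hashlib
import re

def text_to_hashes(text: str, segment_len: int = 24, step: int = 7) -> set:
    """Hash overlapping character segments of normalized text into an unordered set."""
    normalized = re.sub(r"\W+", "", text).lower()  # drop punctuation, spaces and case
    return {
        hashlib.sha256(normalized[i:i + segment_len].encode()).hexdigest()
        for i in range(0, max(1, len(normalized) - segment_len + 1), step)
    }

generated = text_to_hashes("The quick brown fox jumps over the lazy dog, twice.")
suspect   = text_to_hashes("the QUICK brown fox jumps over the lazy dog twice")
# Overlap between the two hash sets hints that the suspect text matches a stored generation,
# without either side ever exchanging the original text.
print(len(generated & suspect), "shared segment hashes")
```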
With such a structure in place, validation occurs with much less operational overhead, and privacy is never even in question.
The RAMP approach here also has dependencies, but they do not invalidate the technique itself as they did for Apple, for analyzing the text generated by Language Services shouldn’t be as taboo as looking at people’s pictures. RAMP focuses on text generated by Language Services with provenance granted by RAMP itself; no actual text people type on their devices is ever a target of the concepts explored in this Annex. RAMP was designed to promote independent fairness and privacy, and this exercise abides by those principles.
In full disclosure, one of my concerns about the approach explored here is the central checking structure I named the “Issues Centre” (sorry for the name, I just needed something like it without taking sides). This is a place that somehow holds the authority to send hashes to both Language Service providers and edge devices when the right set of conditions is met. While I perhaps should elaborate on this institution and its functions, this Annex isn’t about politics or AI regulation itself, so to keep things simple, this entity has only two more assumptions:
The type of institution filing check requests at the Issues Centre is an interesting asset to have, so that distinct approaches, practices and policies can be instated for the checking procedures, which should be handy for policymakers. One possible instrument for such classification standards could be the UN International Standard Industrial Classification of All Economic Activities (ISIC - Link), which can be used to identify industry-specific types of requests with a standardized code. This concept is explored so that, as in “Scenario 1”, only specific types of institutions may work with specific types of document validation, so that function creep may be curbed;
The Issues Centre may only query edge devices after querying Language Service vendors with positive results, a dual confirmation method that is its core and only purpose;
How the Issues Centre queries edge devices and other technical specifications about its capabilities are omitted. What follows are other assumptions required as worldbuilding assets.
Operational assumptions
Several assumptions are made to make this exercise work. In a nutshell, we are mostly talking about databases talking to each other, but real world deployments are packed with challenges that shouldn’t matter for now. If you believe otherwise, send me a message and I’ll update things here.
Generative AI text services
Only RAMP-compliant Language Services are assumed, which means these firms use "5.13" as the version for their service responses to clients, no matter the platform.
It is also assumed these services have proper data retention policies in place that would allow attribution to be credited to users abusing their services. There are many considerations on this topic alone, but it is assumed this sector complies with its field policies and standards for both data privacy and data retention.
It is not assumed here that Language Services hold searchable records of their customers’ texts, but tokens, vector embeddings, semantic hashes or similar technologies representing the texts generated by their services are indeed a requirement for this architecture to work. Keep this in mind whenever “hash” or “hashes” are mentioned here.
Our main fictional Language Service firm is called LLM_Firm_1.
Databases
Three distinct databases are assumed: those that belong to Language Services, a new one linked to edge device OSs, and the one used by the Issues Centre.
It is wise to assume each of these has particularities that should work as a “sanity check” for this exercise. At first glance these are:
Issues Centre Databases:
Only members certified by authorities may add new records;
Database is available to registered members from all Countries;
Edge devices Databases:
Can’t be accessed/erased by users;
Users don’t know what’s in it or when anything was added;
These databases are populated with random, non-linear, semantic hashes. Encryption is advised, but nothing in these databases is intended to ever be reverse-engineered; this is not their purpose. So, sample database architectures may look like the sketch below:
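A minimal sketch in Python dataclasses, where every field name is an assumption made for this exercise only:

```python
from dataclasses import dataclass

@dataclass
class VendorHashRecord:
    """Held by a Language Service: hash of a generated segment plus retention metadata."""
    segment_hash: str      # semantic hash / embedding digest of a generated segment
    account_id: str        # pseudonymous account that requested the generation
    generated_at: str      # ISO timestamp, bounded by the vendor's retention policy

@dataclass
class EdgeDeviceHashRecord:
    """Held by the device OS: hashes of RAMP/5.13 text the device actually received."""
    segment_hash: str      # never derived from what the user types
    received_at: str

@dataclass
class IssuesCentreRequest:
    """Held by the Issues Centre: a pending check filed by a certified institution."""
    request_id: str
    isic_code: str         # institution type, e.g. an ISIC economic activity code
    segment_hashes: tuple  # hashes to be pushed to vendors (and later to edge devices)
```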
Technology overview
Each party takes ownership of its own databases, and only under the correct circumstances is it assumed that the Issues Centre service has the means to communicate with the parties’ databases.
Issues Centre interactions
Whenever a check request is filed, the Issues Centre adds it to a push list that is then submitted to certified Language Services. These services should receive a daily batch of distinct sets of hashes or similar.
After some time, positive feedback should emerge among providers, and only then do singled-out providers start receiving the Issues Centre’s attention for subsequent challenges.
After a sequence of random tokens gets validated, it should be safe to assume that text came from some specific user account at that vendor. Further inquiries can then be made on the edge devices belonging to that vendor user account as a double-checking procedure for authorship, but at this stage provenance has already been validated for that particular text with that particular vendor.
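A sketch of the dual-confirmation flow just described, with hypothetical interfaces: vendors are challenged first, and only a positive vendor match triggers the optional edge-device check.

```python
def validate_provenance(segment_hashes, vendors, edge_device_hashes=None):
    """Return (vendor_name, device_confirmed) or (None, False).
    `vendors` maps a vendor name to the set of hashes it holds; the edge-device check
    only runs after a vendor reports a positive, as assumed in this Annex."""
    matched_vendor = None
    for name, vendor_hashes in vendors.items():
        if segment_hashes & vendor_hashes:          # step 1: daily batch challenge
            matched_vendor = name
            break
    if matched_vendor is None:
        return None, False
    if edge_device_hashes is None:                  # step 2 is optional double-checking
        return matched_vendor, False
    return matched_vendor, bool(segment_hashes & edge_device_hashes)

vendors = {"LLM_Firm_1": {"a1", "b2", "c3"}, "LLM_Firm_2": {"d4"}}
print(validate_provenance({"b2", "zz"}, vendors, edge_device_hashes={"b2"}))
# ('LLM_Firm_1', True): the vendor positive is confirmed again on the suspect's device
```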
Some would say: “Everything so far can be done without RAMP, so why bother?”.
RAMP provenance means only text delivered via its protocol to apps or websites may be added to the databases listed here. This is true for both the Language Service providers themselves and the people receiving their text/voice messages, as the proposed edge device database is managed by the OS, not by any app. Under RAMP standards, nothing the user types ever gets sent to any of these databases, period.
With HTTP there is no independent provenance trail for the OS to validate apps reporting what should or should not go into these databases, giving room for this capability to be exploited by malicious entities and Internet certification cartels, and it would then be the vendors’ and OS makers’ job to attribute provenance, while these platforms already hold too much decision power in their hands. RAMP can isolate Language Service texts, at the delivery method, from all other types of Internet traffic and provide a privacy-friendly alternative for the validation of Language Service generated texts no matter how and where they are used, and only that.
I also believe people will eventually write sentences identical to those generated by Language Service vendors, which is why the concept of institution type was brought in, so that context matters for the validation, especially regarding who is asking about it. It is also why edge device vector indexes are an assumption, as they offer a method to double-check context with a proportional validation process in which the content of what is being checked is indeed irrelevant.
Validation scenarios
The following validation scenarios use this Annex’s technical assumptions to test its capabilities at validating whether a simple text is or isn’t generated by known registered Language Service vendors.
I’m not sure if these are the most politically correct examples, nor if they are really, really correct, as I’m stretching my personal skills to orchestrate every cross-domain topic I’m touching here, so don’t be so hard on me, I’m just a guy.
Scenario 1
Alice cheats on her essay: almost half of it was generated by OpenAI’s ChatGPT, and her professor expressed his concerns to the board. After the document is submitted for further inspection with the Issues Centre, a positive is indeed returned from a single vendor. It did not matter that Alice didn’t copy and paste from the website into her essay; the validation takes random parts of the text that were somehow also present in that vendor’s own vector databases. After vendor confirmation, a second wave of hashes was also sent to Alice’s personal laptop, which confirmed the previous result, a 100% private double positive.
Scenario 2
Alice is having a breakdown and has decided that embarrassing her ex Bob on his socials will make his new girlfriend bail, and she has a plan to do so! She will post a racist message on his Twitter on his behalf. Alice is not really a racist either, so she uses a “prompt hack” she found on Discord that makes a particular Language Service enter a sort of “hater mode”, and she does indeed get her message, a well-crafted, highly xenophobic sentence. But Alice knows that somehow these websites know when you copy their texts, so she takes a pen and writes down, word by word, what she sees on the screen, so that later that day she can send it from her ex’s phone while he is distracted by the game. The perfect plan, she thought: “how would they know it’s generated if I type it myself into someone else’s phone?”
Alice succeeds and Bob’s tweet ends up reported. This is a felony matter and Bob needs a lawyer now. Alice’s plan worked. Trying to prove the innocence of his client, who never wrote that text, Bob’s lawyer sends the suspicious tweet for analysis as probable fake/impersonation content:
As with Apple’s approach, the techniques explored here are meant to enable detection capabilities without exposing user text or personal data to third parties, and while other important topics such as encryption were not even mentioned (all RAMP traffic is encrypted by default anyway), I hope the RAMP capabilities explored here help readers assess why RAMP can play a significant role for the AI services of today and tomorrow, no matter the end-user application, promoting safer interactions free from market players who could abuse or misuse AI safeguards.
Annex 2 - RAMP and Content Provenance Techniques
Guided by its core principles, RAMP updates the Internet specifically for AI so that the Internet itself keeps up with the times, as emerging AI technologies don’t ask permission to explore the pitfalls of legacy governance models in all places digital. RAMP’s distribution mechanism allows it to provide a clear content-type distinction for how today’s AI systems offer their services, a distinction that may eventually become an unavoidable necessity.
The benefits of bootstrapping the RAMP architecture are many, e.g.: artists showing their art online can have two proprietary streams to pitch their craft with the provenance mechanisms offered by HTTP and RAMP1 (for mixed authorship), while 100% AI art is safely isolated at the RAMP2 access tier, so users browsing for art may safely identify what kind of artwork they are seeing no matter the website or app. At the same time that AI-generated content provenance and authorship information gets isolated from its contents for both end users and the online services serving them, AI distinction is available without relying on vendor-locked watermarking, content scanning schemes built by third parties, and other privacy-threatening techniques, as RAMP-compliant services serve media with this type of classification at the protocol’s response header level.
As RAMP provides transparent classification of the AI content types served by applications at such an early stage, RAMP promotes a resilient offering of specific types of AI data and systems in a way that reduces the overhead of delivering specific propositions, such as apps that serve only 100% AI-generated media, enabling cost-saving conditions that should facilitate the commercial distribution of AI systems.
Internet Servers understand the type of media they are serving, particularly those serving specific AI applications that generate AI media. RAMP doesn’t change how end users download images or music, but it requires Servers, no matter the AI application they are serving, to make clear to Clients at the response header level whether a response carries mixed AI content, an AI system, or 100% AI-generated media. This leaves plenty of technical room for sub-classifications at the application level at the most distinct stages of “Alice vs Bob” interactions.
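A sketch of the server side of this requirement, assuming a provenance label is already attached to each asset by the producing application. The provenance labels and the helper name are assumptions of this sketch, while the version values mirror section 4 (text/html would map to "5.10" in the same way).

```python
def ramp_response_headers(asset: dict) -> dict:
    """Build RAMP response pseudo-headers for an asset, based on its provenance label."""
    version_by_provenance = {
        "human": "5.0",       # default class, everything else
        "mixed": "5.11",      # RAMP1: mixed/composite human-AI content
        "synthetic": "5.12",  # RAMP2: 100% synthetic content or AI automated systems
    }
    return {
        ":status": "200",
        ":version": version_by_provenance.get(asset.get("provenance"), "5.0"),
    }

print(ramp_response_headers({"name": "artwork.png", "provenance": "synthetic"}))
# {':status': '200', ':version': '5.12'}
```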
New standards for Generative AI media metadata are still largely unknown to the public, but companies such as Google, Midjourney and Microsoft are already adopting IPTC/C2PA specifications for media generated by AI, and the RAMP protocol can encapsulate these within its standardized provenance with zero friction, offering even more compliance mechanisms for digital asset consumption and training by AI systems, such as those specified by the C2PA metadata, sampled here in CBOR diagnostic format:
// Assertion for specifying whether the associated asset and its data
// may be used for training an AI/ML model or mined for its data (or both).
{
  "entries": {
    "c2pa.ai_inference" : {
      "use" : "allowed"
    },
    "c2pa.ai_generative_training" : {
      "use" : "notAllowed"
    },
    "c2pa.data_mining" : {
      "use" : "constrained",
      "constraint_info" : "may only be mined on days whose names end in 'y'"
    }
  }
}
As seen in Figure 2, when we look under the hood of a scraper engine crawling the www, Servers see little distinction from average browsing; user agents and other signals come to the rescue, but Internet-wise, nobody really watches what bots do as long as they don’t cross the barrier of being considered a threat, given that, as far as definitions go, characterizing a bot’s behavior and intentions can be quite tricky.
HTTP/3 does lots of things better than its predecessors, including serving common HTML pages to scrapers. In Figure 3, a scraper is configured to abide by the assets’ “ai_inference” metadata classification, for media that allows its consumption for AI inference. Here RAMP goes a step further: it offers a chance for AI-made and mixed AI content to be served as such, a much more granular and particularly relevant classification that is specific to AI, on top of the files’ metadata information.
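As a sketch of how a scraper might combine both layers, the check below looks at the RAMP classification exposed at the response level and at the C2PA-style training/mining assertion carried in the asset metadata. The entry names mirror the CBOR sample above; the response dictionary shape and the policy of skipping AI-made media are assumptions of this sketch.

```python
def may_ingest_for_inference(response: dict) -> bool:
    """Decide whether a crawled asset may be consumed for AI inference.
    `response` is a simplified view of a RAMP response: pseudo-headers plus
    a parsed C2PA-style assertion, as in the sample above."""
    version = response.get(":version", "5.0")
    if version in ("5.11", "5.12"):
        # Mixed or fully synthetic media: this sketch skips it to avoid model feedback loops.
        return False
    entries = response.get("c2pa_entries", {})
    use = entries.get("c2pa.ai_inference", {}).get("use", "notAllowed")
    return use == "allowed"

crawled = {
    ":version": "5.0",
    "c2pa_entries": {"c2pa.ai_inference": {"use": "allowed"}},
}
print(may_ingest_for_inference(crawled))  # True
```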
With clear provenance distinction, trust can be broadcast at the protocol level so that Internet users no longer need to rely on domestic laws or corporate policies to trust whether content is authentically human-made or not; classification complexity is exposed in the RAMP protocol response to Clients, which can then do as they please with that data. In such a position, RAMP provides a unique opportunity for AI Service Providers to benchmark and build trust in AI systems, as its robust framework offers a much-needed foundation for responsible and ethical practices, where Internet users then have a chance of no longer being exposed to artificial content or artificial systems by mistake or malicious intent.
RAMP reframes ad hoc AI content provenance policymaking so that Internet platforms, such as Social Media apps, can rely on RAMP’s up-to-date standard to build their user experience upon, while regulators may stop relying on those same platforms as intermediaries for AI content classification.