Turn on Lights and Take Photos

https://eu.36kr.com/en/p/3678722334679937

Publish Date: 2026-02-11 08:12:00

Source Domain: eu.36kr.com

What would you do with $25 (approximately 173 Chinese Yuan)?

Buy a takeaway meal, top up your phone credit, or casually order a Bluetooth headset? For an AI enthusiast and developer in the United States (referred to as Ethan in this article), $25 is enough to build an “intelligent agent that can operate in the physical world”.

He did something that sounds a bit outrageous: he ran the recently popular OpenClaw on a prepaid Android phone that costs $25 to $30 at Walmart. He set it up to receive instructions via Discord and then directly control the phone’s hardware: turning on the flashlight, taking photos for recognition, reading sensors, and even attempting to make a call.

What’s even more interesting is that he is not stopping at one phone. Instead, he plans to set up a whole row of phones to create an agent “phone cluster”.

From Chatbots to “Actionable” Agents

Ethan’s solution is actually not complicated. The core structure is as follows:

● Install Termux (a Linux-like terminal environment for Android) on the Android phone.

● Run the OpenClaw Agent in Termux.

● Call the Android system capabilities through the Termux API.

● Communicate with the Agent via Discord.

In other words, this $25 phone has become an always-online “hardware execution node”. For example, he can send an instruction on Discord: “Hey Claw, turn on the flashlight and then turn it off.” A few seconds later, the phone’s flashlight turns on and then off.

The process behind this is not mysterious: OpenClaw receives the Discord message and calls the Termux API, which in turn calls the Android system interface to complete the hardware operation. Operations that were previously available only to apps or system processes are now carried out by an agent driven by a language model.
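The last hop of that chain, from a decided action to a Termux API call, can be sketched roughly as follows. This is a minimal illustration, not Ethan’s or OpenClaw’s actual code: the article does not say how OpenClaw invokes the Termux API, so the sketch assumes it goes through the command-line tools shipped with the Termux:API package (`termux-torch`, `termux-camera-photo`, `termux-sensor`). The action names, the `photo.jpg` path, and the `build_command` and `run_plan` helpers are hypothetical, and the Discord and language-model steps are left out.

```python
import subprocess
import time

# Hypothetical action names (not from OpenClaw) mapped to Termux:API
# command-line tools. The output file name "photo.jpg" is an assumption.
ACTIONS = {
    "torch_on": ["termux-torch", "on"],
    "torch_off": ["termux-torch", "off"],
    "take_photo": ["termux-camera-photo", "-c", "0", "photo.jpg"],
    "list_sensors": ["termux-sensor", "-l"],
}


def build_command(action):
    """Translate an action name into an argument list for subprocess."""
    if action not in ACTIONS:
        # Only whitelisted actions are allowed; the model never supplies
        # a raw shell string.
        raise ValueError(f"unsupported action: {action}")
    return list(ACTIONS[action])


def run_plan(plan, runner=subprocess.run, pause=1.0):
    """Run each action in order and return the commands that were executed.

    `runner` defaults to subprocess.run and can be swapped for a stub when
    no phone is available. `pause` is the delay in seconds between steps.
    """
    executed = []
    for index, action in enumerate(plan):
        command = build_command(action)
        runner(command, check=True)
        executed.append(command)
        if index < len(plan) - 1:
            time.sleep(pause)
    return executed
```

On a phone with Termux and the Termux:API add-on installed, the flashlight instruction above would correspond to something like `run_plan(["torch_on", "torch_off"])`. Restricting the agent to a fixed table of actions, rather than letting it pass arbitrary shell strings, is a design choice of this sketch, not something the article describes.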

In Ethan’s view, what’s really interesting is not “being able to turn on the flashlight”, but “the model starting to have physical execution capabilities”.

Photo + GPT 5.2: The Visual Ability of an Entry-Level Phone

To prove that…
