Reverse Engineering the Emporia Vue Utility Connect
I got the Emporia Vue Utility Connect with the intent of using jrouvier's port to ESPHome, but it turns out my model doesn't communicate the same way as the version jrouvier had. Specifically, upon flashing ESPHome onto the Vue I was greeted with this error:
That sucks, but should be solvable right? Let's go on a journey!
What are the elements we care about?
Let's start with scoping out the actual elements at play here: the ESP32 chip, the MGM111 chip, and the utility power meter.
The ESP32 chip is the main microcontroller in the Vue device and communicates with the MGM111 chip over UART.
The MGM111 chip speaks Zigbee which is how it communicates with the power meter.
When installing the ESPHome firmware, we are only modifying the ESP32 chip and need our firmware to transmit the same UART commands to the MGM111 chip that the stock firmware would send. We would then need to make sense of the responses.
Is the payload itself an error message?
The payload was expected to be 152 bytes, but mine is 44 bytes. It's significantly smaller, so maybe it's just an error response? How can we confirm this?
Let's try to induce an error. As seen in the code, a meter-join request is sent before requesting meter readings. Logically, if we try to request a reading without requesting to join the meter first, that would return an error, right?
Turns out the read request still returns a 44-byte payload, and it looks almost identical to the previous one. Dang, so the original payload may well be an error response.
Is the meter join request even successful?
Let's try to induce an error at this stage.
We can try preventing the Zigbee communication with the meter from happening, but we also need the ESP32 to still be able to connect to my wifi and show its logs. How to do this... Let's stick it in a metal bottle!
I can just point the mouth of the bottle near a wifi AP and away from the utility meter. So what did that do?
The meter-read request no longer gets a response, so we've successfully broken the Zigbee communication, but the meter-join request still returns a 1. That is suspicious because 1 happens to be the same value it returns even when there is a Zigbee signal. This has proven to be kind of a dead end. What now?
What if we can confirm the stock firmware gets the same responses?
The ESP32 and MGM111 communicate over a UART bridge, so in theory we should be able to sniff the traffic. Let's flash the original firmware back on to see how it behaves.
Fortunately, jrouvier already documented the pins of interest on the board, which for us would be the P5 group of pins. Time to sniff!
Here is the communication for requesting a meter reading:
request : 24720D
Looking at the docs again, if we take out the header and footer bytes we can see that it is the same 44-byte payload I saw while running ESPHome, so that payload is expected and valid! Fantastic, now to analyze the bytes.
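As a quick sanity check on the sniffed request, the three bytes can be decoded as ASCII. This is only a small Python sketch for inspecting the capture; the variable names are mine and nothing here comes from the Vue firmware.

```python
# The request bytes sniffed from the UART line, written as hex.
request = bytes.fromhex("24720D")

# Print each byte next to its ASCII representation.
for b in request:
    print(f"0x{b:02X} -> {chr(b)!r}")

# All three bytes are plain ASCII, so the whole request decodes cleanly.
decoded = request.decode("ascii")
```

The request is a `$`, a lowercase `r`, and a carriage return. Reading the `r` as "read" is my guess, not something the capture proves.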
Identify the bytes of interest
Let's compare three readings to see which bytes change between them:
Only three groups of bytes (plus one incrementing byte) ever change, so what are they?
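Spotting the changing bytes by eye gets tedious, so here is a rough Python sketch of the same comparison. The helper names and the short sample payloads are made up for illustration; the real payloads are 44 bytes.

```python
def changing_offsets(payloads):
    """Return the byte offsets whose value differs in at least one payload."""
    length = len(payloads[0])
    if any(len(p) != length for p in payloads):
        raise ValueError("payloads must all be the same length")
    return [i for i in range(length) if len({p[i] for p in payloads}) > 1]


def group_runs(offsets):
    """Collapse sorted offsets into (start, end) runs of adjacent bytes."""
    runs = []
    for off in offsets:
        if runs and off == runs[-1][1] + 1:
            runs[-1] = (runs[-1][0], off)
        else:
            runs.append((off, off))
    return runs


# Made-up 8-byte payloads, NOT real captures.
samples = [
    bytes.fromhex("0001aa10200000ff"),
    bytes.fromhex("0002aa11200000ff"),
    bytes.fromhex("0003aa12210000ff"),
]
offsets = changing_offsets(samples)
runs = group_runs(offsets)
print(offsets, runs)
```

Each run of adjacent offsets is a candidate field. A field whose high bytes happen to stay constant across the samples will look narrower than it really is, so more samples give a more trustworthy picture.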
Let's call the incrementing byte "H0", and the three groups shown above can be H1, H2, and H3. Let's also swap their endianness. We can also take the reported watts from the Emporia app to know what value we should be expecting.
Hex Group 3
H3 is clearly related to the wattage reading because all the negative wattages start with F's, but what's up with 33.7 and 45.3? According to H3 they're supposed to be negative... and the other calculated wattages are close-but-not-equal to the reported wattage. What's going on here?
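The leading F's are what a negative number looks like in two's complement once the bytes are swapped into big-endian order. A minimal Python sketch of that interpretation follows; it assumes H3 is a 4-byte little-endian field, and the sample bytes are invented rather than taken from my captures.

```python
def swapped_hex(raw: bytes) -> str:
    """Reverse the byte order and render it as uppercase hex."""
    return raw[::-1].hex().upper()


def le_signed(raw: bytes) -> int:
    """Interpret little-endian bytes as a two's-complement signed integer."""
    return int.from_bytes(raw, byteorder="little", signed=True)


# Invented examples of a field as it might appear on the wire.
negative = bytes.fromhex("9cffffff")
positive = bytes.fromhex("2d000000")

print(swapped_hex(negative), le_signed(negative))
print(swapped_hex(positive), le_signed(positive))
```

Here `9cffffff` swaps to `FFFFFF9C` and decodes to -100, while `2d000000` swaps to `0000002D` and decodes to 45.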
The data above were hand-picked, but I think we need to look at many in a row to help detect patterns. So, I created a quick and dirty tool. See the pattern below?
The quoted kW doesn't line up with the calculated W value, but it is offset (for the most part)! In this case, it is offset by a minute. So either my system's clock or Emporia's clock is off. Couldn't be me — Emporia's clock must be off by a minute. 😇
This also explains the 33.7 and 45.3 values having negative hex values. Given they were so relatively close to 0, the adjacent minute's reading must have been negative.
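The one-minute offset can also be found mechanically by sliding one series against the other and keeping the shift with the smallest error. This is a hedged Python sketch, not the quick and dirty tool mentioned above; the function name and the sample numbers are hypothetical.

```python
def best_lag(reported, calculated, max_lag=3):
    """Find the shift (in samples) that best aligns the two series.

    A lag of k compares reported[i] with calculated[i + k].
    Returns (lag, mean_absolute_error).
    """
    best = None
    for lag in range(-max_lag, max_lag + 1):
        pairs = [
            (reported[i], calculated[i + lag])
            for i in range(len(reported))
            if 0 <= i + lag < len(calculated)
        ]
        if not pairs:
            continue
        err = sum(abs(r - c) for r, c in pairs) / len(pairs)
        if best is None or err < best[1]:
            best = (lag, err)
    return best


# Invented per-minute watt readings; `calculated` trails `reported` by one sample.
reported = [500, 520, -100, -150, 40, 300]
calculated = [480, 500, 520, -100, -150, 40]

lag, err = best_lag(reported, calculated)
print(lag, err)
```

With one sample per minute, a best lag of 1 corresponds to the one-minute offset described above.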
Now what about the other hex groups?
Hex Groups 1 & 2
Let's sort chronologically instead:
What pattern do you see with H1 and H2?
Both columns have sections where they don't change (see the top rows of H2 and bottom rows of H1), but does that correlate with anything? That's right, H2 doesn't change when the wattage is positive, while H1 doesn't change when the wattage is negative. They also seem to increase in value over time (with rollover).
These must be counting up the cumulative watt-hour usage. But is it Wh or kWh? Let's convert some to decimal and check:
(Table: decimal H1 and H2 readings, one row per minute, each with a "Watts (diff * 60)" column.)
In this table, each row is one minute after the previous row. The difference between two rows is the energy used during that minute, in watt-hours; multiplying it by 60 (minutes per hour) converts it to the average watts for that minute. This should match the actual wattage reading at the time, and it does! So I can confirm that H1 is watt-hours consumed while H2 is watt-hours produced.
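The diff-times-60 arithmetic, including the counter rollover noted earlier, can be sketched in a few lines of Python. The function and parameter names are my own, and the sample readings are made up. The counter width is a parameter because I first treated these counters as two bytes and later as four.

```python
def watts_from_counter(prev_wh, curr_wh, interval_s=60, counter_bits=32):
    """Average watts over one interval from two cumulative Wh readings."""
    modulus = 1 << counter_bits
    # Taking the difference modulo the counter size absorbs a rollover.
    delta_wh = (curr_wh - prev_wh) % modulus
    # Wh per interval -> W: scale by the number of intervals in an hour.
    return delta_wh * 3600 / interval_s


# Invented readings one minute apart: 12 Wh in a minute.
normal = watts_from_counter(1000, 1012)

# Invented readings across a 16-bit rollover: 0xFFFE -> 0x0003 is 5 Wh.
wrapped = watts_from_counter(0xFFFE, 0x0003, counter_bits=16)

print(normal, wrapped)
```

12 Wh in one minute works out to 720 W, and the wrapped example gives 300 W rather than a large negative number.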
Upon seeing more data, I've concluded that H1 and H2 are actually four bytes each, not two. I will keep the original writing above but have updated the remainder of the article below.
Why didn't jrouvier's code work?
Their work was based on the returned payload from the MGM111 firmware version 2, but mine shipped with version 7. I've documented the difference in the payload here.
TL;DR, this is the V7 payload format:
Show me the code
My main interest is to have the wattage reading, so that is what I've implemented in my fork:
And it works!