Reconnecting Feather WICED
Moderators: adafruit_support_bill, adafruit

Please be positive and constructive with your questions and comments.

by jensa on Tue Apr 04, 2017 9:00 am

Sorry for posting yet again about this, but I'm having major stability issues with the WICED Feather: it gets a full hardware hang when the access point it's connected to goes down. My former posts all contained too little info, so forgive me if this is too much info :-)

My setup uses Amazon AWS for MQTT. When the access point goes down I can see an error in my Serial monitor:
Code:
SDEP_CMD_MQTTPUBLISH failed

and that is of course expected since there is no longer a connection to the server. What is definitely not correct is that the DisconnectCallback is never called. In 70% of the cases, the WICED Feather will hang just from the disconnection itself.

In the remaining cases (30%), the device will not crash and I get the correct response (true/false) from the call to mqtt-publish that detects the disconnection. In this case, the device will continue executing code and I can see my debug info for memory usage:

Free mem: 6752
Max mem: 10320
Free FeatherLib mem: 22628
Max FeatherLib mem: 50404


It looks to me like I'm well within bounds when it comes to memory? In these 30% of cases, the device will reconnect just fine after losing connection.

My MQTT connection is set up like this:

In header-file:
Code:
  #define MQTT_TX_BUFSIZE   1024
  #define MQTT_RX_BUFSIZE   1024
  AdafruitMQTT mqtt;


In cpp file:
Code:
  mqtt.setDisconnectCallback(disconnect_mqtt_callback); // Set the disconnect callback handler
  mqtt.tlsSetIdentity(aws_private_key, local_cert, LOCAL_CERT_LEN); // Setting Identity with AWS Private Key & Certificate
  mqtt.err_actions(true, false); // Tell the MQTT client to auto print error codes, but not to halt on errors
  makeRandomID();
  mqtt.setBufferSize(MQTT_TX_BUFSIZE, MQTT_RX_BUFSIZE);
  bool didWeConnectToAws = mqtt.connectSSL(MQTT_HOST, MQTT_PORT);


My disconnect_mqtt_callback only contains a Serial.println statement to show that this part of the code was reached. It is never executed when a disconnection occurs.

Here's my Feather info.
Code:
Bootloader  : 1.0.0
WICED SDK   : 3.5.2
FeatherLib  : 0.6.2
Arduino API : 0.6.2


I really need to solve this and I have offered to help with FeatherLib without response. I'm willing to spend whatever time required and dive as deep in code as required.

jensa
 
Posts: 159
Joined: Wed Dec 21, 2011 9:35 pm

Re: Reconnecting Feather WICED

by adafruit_support_mike on Wed Apr 05, 2017 11:29 pm

My best guess is that the code is calling a network function that waits for a response, but never gets one.

Try setting a watchdog timer before making the mqtt-publish call. If it times out, have that handler force a disconnect.
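As a rough illustration of the bookkeeping behind that suggestion, here is a minimal, portable C++ sketch of a software deadline that is armed before a network call and invokes a handler if it expires. The `Deadline` class and its method names are hypothetical and are not part of FeatherLib or the Arduino API. It also assumes that `poll()` is driven from something that keeps running while the network call blocks (a timer tick, for example); if the whole MCU stops, only a hardware watchdog could help.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical software deadline; not a FeatherLib or Arduino class.
class Deadline {
public:
    explicit Deadline(std::function<void()> on_timeout)
        : on_timeout_(std::move(on_timeout)) {}

    // Arm just before the call that might never return a response.
    void arm(uint32_t now_ms, uint32_t timeout_ms) {
        assert(timeout_ms > 0);
        start_ms_ = now_ms;
        timeout_ms_ = timeout_ms;
        armed_ = true;
    }

    // Disarm once the call has returned normally.
    void disarm() { armed_ = false; }

    // Call periodically. Runs the handler once when the deadline has
    // passed and returns true in that case.
    bool poll(uint32_t now_ms) {
        if (!armed_) return false;
        // Unsigned subtraction stays correct across a millisecond
        // counter wrap-around.
        if (static_cast<uint32_t>(now_ms - start_ms_) < timeout_ms_) return false;
        armed_ = false;
        if (on_timeout_) on_timeout_();
        return true;
    }

private:
    std::function<void()> on_timeout_;
    uint32_t start_ms_ = 0;
    uint32_t timeout_ms_ = 0;
    bool armed_ = false;
};
```

In a sketch, the handler would be the place to force the disconnect, with `arm()` called right before the publish and `disarm()` right after it returns.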

adafruit_support_mike
 
Posts: 44191
Joined: Thu Feb 11, 2010 2:51 pm

Re: Reconnecting Feather WICED

by jensa on Thu Apr 06, 2017 7:45 pm

I think you misunderstood what I wrote? What happens is a hardware hang. I cannot set timers or do anything.
The following Gist will show the problem so it can be debugged (but not by me, since I don't have access): https://gist.github.com/jenschr/8ff93c0 ... 0b3263c549

J

jensa
 
Posts: 159
Joined: Wed Dec 21, 2011 9:35 pm

Re: Reconnecting Feather WICED

by adafruit_support_rick on Fri Apr 07, 2017 10:10 am

It would take me a bit of time to work all the way through the maple runtime, but from what I've seen so far, I'm thinking that maybe you can't do serial prints in the callback routine. Try taking them out and see what happens.

If that doesn't help, then the watchdog as Mike suggested should work for you.
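One common way to keep serial prints out of a callback is to have the callback only set a flag and do the printing later from the main loop. The following is a minimal, portable C++ sketch of that pattern; the names `g_disconnected` and `handle_pending_disconnect` are made up for illustration, and whether this avoids the hang on the WICED Feather is an assumption, not something established here. In an Arduino sketch a `volatile bool` would typically take the place of `std::atomic<bool>`, and `Serial.println` the place of `std::puts`.

```cpp
#include <atomic>
#include <cstdio>

// Hypothetical flag shared between the callback and the main loop.
static std::atomic<bool> g_disconnected{false};

// Callback body kept minimal: no printing, no blocking calls.
void disconnect_callback() {
    g_disconnected.store(true);
}

// Called from the main loop. Returns true if a disconnect was pending.
bool handle_pending_disconnect() {
    // exchange() reads and clears the flag in one step, so an event
    // arriving between the read and the clear is not lost.
    if (!g_disconnected.exchange(false)) {
        return false;
    }
    std::puts("Disconnected (reported from the main loop)");
    return true;
}
```

Calling `handle_pending_disconnect()` at the top of `loop()` would then report the event outside the callback context.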

adafruit_support_rick
 
Posts: 34764
Joined: Tue Mar 15, 2011 11:42 am
Location: Buffalo, NY

Re: Reconnecting Feather WICED

by jensa on Fri Apr 07, 2017 6:02 pm

Thanks for replying Rick. I'll test that and report back.

Regarding using a watchdog - this is fairly complicated. By just running my sketch in the Gist, you'll quickly see what happens. It just locks up. Also - this is not only related to MQTT. As that example shows, this applies to network connections in general. It's thus a rather grave problem I would say? I COULD of course start a WDT before any call to anything that has to do with network access, but it's not of much help if something breaks inside FeatherLib? I mean - there's no way for me to restore an incorrect (or correct?) state beneath the sdep class, is there?

jensa
 
Posts: 159
Joined: Wed Dec 21, 2011 9:35 pm

Re: Reconnecting Feather WICED

by adafruit_support_rick on Sat Apr 08, 2017 1:44 pm

OK. This is mysterious. I'm going to have to escalate it. Technically, the WICED is not hanging. It's "halting", which apparently also halts the watchdog, since it doesn't fire. That alone is baffling to me.

adafruit_support_rick
 
Posts: 34764
Joined: Tue Mar 15, 2011 11:42 am
Location: Buffalo, NY

Re: Reconnecting Feather WICED

by jensa on Mon Apr 10, 2017 3:45 am

Tested and removing Serial is relevant, but not a fix. If I remove the reference to Serial, the device will not hang completely, but it will not be able to reconnect.

I can of course force it to restart using WDT, but that's not solving my problem of uptime since reconnecting to both web & AWS takes many seconds every time this occurs.

jensa
 
Posts: 159
Joined: Wed Dec 21, 2011 9:35 pm

Re: Reconnecting Feather WICED

by adafruit_support_rick on Mon Apr 10, 2017 10:20 am

I escalated this to the WICED engineers. They'll respond soon.

adafruit_support_rick
 
Posts: 34764
Joined: Tue Mar 15, 2011 11:42 am
Location: Buffalo, NY

Re: Reconnecting Feather WICED

by jensa on Tue Apr 18, 2017 4:23 pm

Hi Rick,
That would be great. I'm eagerly waiting :)

jensa
 
Posts: 159
Joined: Wed Dec 21, 2011 9:35 pm

Re: Reconnecting Feather WICED

by adafruit_support_rick on Wed Apr 19, 2017 8:58 am

They are in a crunch, getting an nRF52 release out. But I bumped the request.

adafruit_support_rick
 
Posts: 34764
Joined: Tue Mar 15, 2011 11:42 am
Location: Buffalo, NY

Re: Reconnecting Feather WICED

by hathach on Fri Apr 21, 2017 3:30 am

Hi, sorry for the late response. The nRF52 release is taking more time than expected (happens all the time :D).
A quick test using the AWS example sketch, with togglePin(ledPin) added to loop(), indicates the WICED is not hung, since the LED keeps blinking after the disconnection. However, as you say, the disconnect callback does not trigger. I'm looking into this and will post when I have an update.

UPDATE: it seems that when the AP disconnects, the Broadcom SDK/stack does not call the TCP disconnect callback as it should. As a result we can't clean up the data/status, and a reconnection does not happen. Diving deeper into the stack code.
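Since the disconnect callback cannot be relied on in this situation, one possible user-side workaround is to infer the lost link from the true/false result of the publish calls instead. Below is a minimal, portable C++ sketch of that idea. `LinkMonitor` and its threshold are hypothetical and not part of FeatherLib, and whether a forced disconnect and reconnect actually recovers cleanly on the WICED Feather is an open question here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: tracks publish results to infer a lost link.
class LinkMonitor {
public:
    explicit LinkMonitor(uint8_t max_failures) : max_failures_(max_failures) {
        assert(max_failures > 0);
    }

    // Feed in the true/false result of each publish call.
    // Returns true once enough consecutive failures have been seen
    // that the caller should tear down and reconnect.
    bool record_publish(bool ok) {
        if (ok) {
            failures_ = 0;  // any success clears the streak
            return false;
        }
        if (failures_ < max_failures_) ++failures_;  // saturate, no overflow
        return failures_ >= max_failures_;
    }

    // Call after a successful reconnect.
    void reset() { failures_ = 0; }

    uint8_t failures() const { return failures_; }

private:
    uint8_t max_failures_;
    uint8_t failures_ = 0;
};
```

Requiring several consecutive failures rather than one avoids a full reconnect, which takes many seconds, for a single dropped publish.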

hathach
 
Posts: 476
Joined: Tue Apr 23, 2013 1:02 am

Re: Reconnecting Feather WICED

by jensa on Fri Apr 21, 2017 4:41 am

Hi @hathach,
Please check out the example I made to reproduce the bug easily:
https://gist.github.com/jenschr/8ff93c0 ... 0b3263c549

As mentioned before - if possible I'd love to help out with FeatherLib.

J

jensa
 
Posts: 159
Joined: Wed Dec 21, 2011 9:35 pm

Re: Reconnecting Feather WICED

by hathach on Fri Apr 21, 2017 4:56 am

Ah thanks, I'll switch to testing with your sketch now :D

We would love to, but the strict licence from Broadcom bars us from doing anything like that. That is why we "invented" SDEP and try to keep most of the high-level application on the Arduino/user side.

Code:
/*
 * Copyright 2015, Broadcom Corporation
 * All Rights Reserved.
 *
 * This is UNPUBLISHED PROPRIETARY SOURCE CODE of Broadcom Corporation;
 * the contents of this file may not be disclosed to third parties, copied
 * or duplicated in any form, in whole or in part, without the prior
 * written permission of Broadcom Corporation.
 */

hathach
 
Posts: 476
Joined: Tue Apr 23, 2013 1:02 am

Re: Reconnecting Feather WICED

by jensa on Fri Apr 21, 2017 6:09 am

Ok. I'm sure I read somewhere on either Github or here that if one had a signed NDA with Broadcom that it was possible?

jensa
 
Posts: 159
Joined: Wed Dec 21, 2011 9:35 pm

Re: Reconnecting Feather WICED

by hathach on Fri Apr 21, 2017 11:15 am

As for the licence, I have no idea.
Regarding the bug, it is part of a greater crime though. Even a DNS resolve won't work after toggling an Android hotspot and reconnecting, although the Feather can connect and print out the network details. The state is not cleaned up properly in the system.

Code:
/*********************************************************************
 This is an example for our Feather WIFI modules

 Pick one up today in the adafruit shop!

 Adafruit invests time and resources providing this open source code,
 please support Adafruit and open-source hardware by purchasing
 products from Adafruit!

 MIT license, check LICENSE for more information
 All text above, and the splash screen below must be included in
 any redistribution
*********************************************************************/

#include "adafruit_feather.h"

/* This example demonstrates how to use the getHostByName function
 * to lookup an IP for a hostname. A string representation
 * of an IP can also be used directly.
 */

#define WLAN_SSID            "yourSSID"
#define WLAN_PASS            "yourPassword"

// target by hostname
const char target_hostname[] = "adafruit.com";

void disconnect_callback(void)
{
  Serial.println("disconnect_callback(): Disconnected");
}

/**************************************************************************/
/*!
    @brief  The setup function runs once when the board is reset
*/
/**************************************************************************/
void setup()
{
  Serial.begin(115200);

  // Wait for the Serial Monitor to open
  while (!Serial)
  {
    /* Delay required to avoid RTOS task switching problems */
    delay(1);
  }

  Serial.println("GetHostByName Example\r\n");

  // Print all software versions
  Feather.printVersions();

  // Set disconnection callback
  Feather.setDisconnectCallback(disconnect_callback);

  while ( !connectAP() )
  {
    delay(500); // delay between each attempt
  }
}

/**************************************************************************/
/*!
    @brief  The loop function runs over and over again forever
*/
/**************************************************************************/
void loop()
{
  if ( Feather.connected() )
  {
    // Resolve hostname to an IP address
    IPAddress ipaddr;

    ipaddr = Feather.hostByName(target_hostname);
    Serial.print(target_hostname);
    Serial.print(" -> ");
    Serial.println(ipaddr);

    Serial.println("Try again in 5 seconds");
    Serial.println();
    delay(5000);
  }
  else
  {
    Serial.println("Reconnecting ...");
    while ( !connectAP() )
    {
      delay(500); // delay between each attempt
    }
  }
}

/**************************************************************************/
/*!
    @brief  Connect to defined Access Point
*/
/**************************************************************************/
bool connectAP(void)
{
  // Attempt to connect to an AP
  Serial.print("Please wait while connecting to: '" WLAN_SSID "' ... ");

  if ( Feather.connect(WLAN_SSID, WLAN_PASS) )
  {
    Serial.println("Connected!");
  }
  else
  {
    Serial.printf("Failed! %s (%d)", Feather.errstr(), Feather.errnum());
    Serial.println();
  }
  Serial.println();

  // Connected: Print network info
  Feather.printNetwork();

  return Feather.connected();
}

hathach
 
Posts: 476
Joined: Tue Apr 23, 2013 1:02 am
