Cobalt at ICASSP

Submitted By: Stan Salvador

Last week, many of us at Cobalt “attended” the ICASSP (International Conference on Acoustics, Speech, and Signal Processing) conference “in Barcelona”.  ICASSP is one of the premier conferences for speech recognition and signal processing.  This year it had four times more “attendees” than any previous year.

The quotes above are intentional … because nobody actually went to Barcelona, nobody paid to attend, and the conference was run as a virtual (no inflection this time) event.  The Covid-19 virus has caused many events in 2020 to move online, and ICASSP was no exception.  Most paper talks were pre-recorded, and several ceremonies, keynotes, and panels were recorded live and posted online for on-demand access afterwards.

This conference was unlike any other, and it took a heroic effort by the organizers to make it as successful as possible under challenging circumstances and on rather short notice.  The conference was not converted to a virtual conference until just two months before the start date.  I’d like to thank the organizers for their efforts.  As always, the conference had many great papers presented by some very talented researchers.  However, I think the most interesting and relevant topic to discuss is how well the organizers were able to pull off a virtual conference.  Covid-19 related travel restrictions may still be around next year, and even if they are not, many people will likely not be willing or able to travel.  Next year’s ICASSP (as well as most other conferences in the near future) will likely be fully or partially virtual.

Cobalt Speech is a ~92% virtual company where we all work from our homes all around the world and do nearly all of our communication over video and text chat.  The “92%” estimate takes into account the 3 times a year (plus 1-2 times at conferences) that the entire company congregates in the same physical space at “Cobalt Workshops” (nicknamed “CoWs”).  We are constantly thinking about the advantages and disadvantages of working remotely and how to do it more effectively.  In that spirit, I have created a pro/con list for this virtual ICASSP, covering some ways that I think it was an improvement over previous years and some aspects of the live conference that I personally missed.

Pros:

  • All talks are recorded and available on-demand.  This was a fantastic addition to the conference.  It is very common for two very interesting talks to be taking place at the same time, so difficult decisions must be made.  It is also common for presentations that are uninteresting to a particular person to be sandwiched between two must-attend talks.  With on-demand recordings, neither situation is a problem.
  • IEEE memberships may become more useful.  If presentations are regularly recorded in the future, that will create a growing library of presentations that would make IEEE memberships much more valuable than just making research papers available through membership.  Access to research papers may change significantly in the future due to the growing open access movement that is pushing free access to peer-reviewed research.
  • Virtual conferences can be “attended” by more people.  Remote conferences can be attended by people who are too busy or unable to travel long distances. We had a number of Cobalt employees participate who wouldn’t otherwise have gone this year, either because they have young children and couldn’t commit to a whole week away, or because only some of the talks were relevant for them. Prices can also be reduced due to a larger number of attendees and reduced venue costs.

Cons:

  • It’s hard to tell which sessions/talks are popular.  At conferences there is often a line of people spilling out of a room, balancing on their toes trying to read part of a slide being presented.  That is a pretty reliable indicator that a really great talk is going on nearby and that I should consider joining the queue.  This is particularly true at poster sessions, where small crowds will congregate around popular posters.  Word of mouth is also useful, as I’m often intercepted by current and former colleagues in the halls who point me towards their favorite presentations/papers at the conference.  Unfortunately, I didn’t feel like I had any of this kind of feedback at this conference.  Adding features like view counts for talks, or better ways to ask presenters questions, would improve this a bit.
  • This conference lacked social interaction.  I have many colleagues/friends that I reliably catch up with a couple of times a year at conferences … but that was not possible this time.
  • A missed opportunity for hiring.  Cobalt Speech is a small company, and a booth at ICASSP is a great way to get in contact with some talented speech researchers who are looking for a job, or may be doing so at some point in the future.
  • Few people asked questions during the talks.  I’ve noticed that as the number of people grows in a video chat, the percentage of time that only one person is speaking with everyone else on mute also grows.  Sessions had a way to submit text questions that would be answered after the talk, but I don’t think it was very successful.  Almost all questions seemed to be asked anonymously and the uniformity of each talk having only 1-2 questions was a pretty good sign that they were mostly the “polite” questions asked by the session chair.
  • Remote conferences have some time zone difficulties for many attendees.  I struggled to attend the talks live.  Most talks happened while I would normally be sleeping due to the nine-hour time difference.  Being able to watch them afterwards mostly made up for it, but it didn’t feel as much like being part of things.
  • I really miss traveling to new international locations for conferences.  ICASSP had a couple of “cultural meet-ups” with “virtual tours of Barcelona” … but it’s hard to get as excited about that as being there in person.

While my “cons” list may seem a bit longer than my “pros” list, some of the positive aspects of the virtual conference are very significant, and some of the “cons” are more nice-to-haves.  It is amazing how effective the virtual conference was despite the challenging circumstances in which it was put together, and the way that virtual conferences are run will only improve over time.  I expect that as we start physically attending conferences again, conferences will be split between some virtual and some in-person attendance.  Hopefully, this will let everyone choose the option that best fits their circumstances.  I also look forward to being able to travel again and continuing to build up my extensive collection of uniquely designed ICASSP and Interspeech proceedings-loaded USB drives (the New Orleans ICASSP 2017 USB stick shaped like a guitar is my favorite so far).

thumb drives from ICASSP

About the Author

Stan Salvador is the Chief Scientist at Cobalt and takes a lead role in the technical approach on Cobalt projects.  Prior to Cobalt, Stan was a key speech scientist at Amazon. He was at Amazon for four years, contributing directly to acoustic modeling (AM) on the Echo as well as to other Amazon speech projects like Fire TV, Dash, and Shopping.  Stan worked three years at Yap, where he was a key contributor on Yap’s voicemail recognition platform.  Stan also worked in machine learning applications at NASA, General Dynamics, and NuTech Solutions.  Stan earned an M.S. in Computer Science at the Florida Institute of Technology.
