SIAM News Blog

MDS22 Panel Addresses the Complex Interplay Between Academia and Industry

By Lina Sorg

When considering future career paths, junior scientists may view academia and industry as completely separate spaces with their own independent interests, purposes, and goals. While this mindset was widely accepted in the past, the current workforce certainly benefits from a closer relationship between the two settings — each of which can support the other in valuable ways. During a panel session at the 2022 SIAM Conference on Mathematics of Data Science (MDS22)—which took place in San Diego, Calif., in September—three industrial researchers shared their experiences at their respective companies, addressed industry’s interplay with academia, and offered advice to attendees who intend to pursue careers in industry. Eldad Haber (University of British Columbia) moderated the hour-long panel, which comprised Dawn Woodard (LinkedIn), Wotao Yin (Alibaba Group/Academy for Discovery, Adventure, Momentum and Outlook), and Shashanka Ubaru (IBM Research).

The panelists opened the session with a conversation about the differences between academia and industry, as well as the advantages of an ongoing dialogue between the two. “Academia goes deep, and industry makes things scalable in a business sense,” Yin said, adding that assignments in industry move at a much quicker pace. Woodard agreed and commented that academic researchers tend to explore one problem very thoroughly over a long period of time, whereas industrial researchers typically address multiple problems simultaneously and on much shorter timescales. This pattern means that the breadth of problems in industry is larger than in academia, though the depth is not as great. When Woodard first transitioned to industry from a faculty position at Cornell University, the interconnectivity of the problems was an unexpected change. “It’s exciting that you can spit out the solution in one or two months,” she said. “This was mind-blowing for me coming from academia, where it took years just to get data.”

Yet because industry projects generally focus on applications of a specific product, answering fundamental research questions is not a priority. In some cases, scientists simply tweak existing techniques rather than develop novel ones. “Someone in academia has to be working on fundamental, basic questions,” Ubaru said. “A good collaboration would really help that.” He conceptualized a relationship wherein industry researchers pose questions that academics could subsequently spend substantial time exploring. Woodard concurred. “Figuring out how to share the data would be incredibly powerful,” she said.

Haber then asked the panelists to envision future projects that would pique their interests. Woodard, who spent seven years with Uber before joining LinkedIn in May 2022, wants to see a reinforcement learning solution for routing in road networks. Meanwhile, Yin would like to create optimization solvers for data science experts, and Ubaru, who is a member of the Mathematics of AI group at IBM, hopes to explore the potential use of quantum computing on a practical research basis. When Haber inquired as to whether companies are willing to put in the necessary overhead that would allow industry-academia collaborations to pursue such projects, panelists agreed that interest would depend on the size of both the prospective company and university. “The appetite for this at small and medium-sized companies is much lower,” Woodard said. “The idea of funding a university for research is a lot harder to sell.”

A panel of industry researchers at the 2022 SIAM Conference on Mathematics of Data Science, which took place in San Diego, Calif., this September, addressed the complex yet mutually beneficial relationship between academia and industry. From left to right: moderator Eldad Haber (University of British Columbia) engages panelists Wotao Yin (Alibaba Group/Academy for Discovery, Adventure, Momentum and Outlook), Shashanka Ubaru (IBM Research), and Dawn Woodard (LinkedIn) in conversation. SIAM photo.

Nevertheless, Woodard noted that certain initiatives at LinkedIn currently fund projects in the Department of Computer Science at Cornell University. Ubaru referenced IBM’s fellowship programs, which provide clearly defined funding and objectives for professors to work on specific topics of interest at IBM for a year, as well as several joint programs with New York University and fellowships that allow Ph.D. students to gain industry experience. And Yin remarked that Alibaba, which is an e-commerce platform, houses a specific department that focuses on public affairs and cooperation, allocates a budget to every lab, and supports potential collaborators. The company has also sponsored visiting professorships and scholarships.

Discussion then turned to the nuances of intellectual property (IP) in the context of collaboration. Yin stated that some IP officers suggest avoiding joint IPs in favor of solo IPs or open-source licenses, though Woodard revealed that Uber does hold IP agreements with other institutions. “If you’re working on a problem that’s very closely related to something that we really need, that’s the obvious use case,” she said. “Pick something that is close to the fold for them, and you’ll immediately generate interest.” Conversations with colleagues in a variety of focus areas can provide researchers with a sense of relevant problems, which heightens their chances of developing collaborations with different companies. In some cases, there may be numerous options for partnerships regarding a particular idea; in the context of ridesharing, for example, a project that does not appeal to Uber might intrigue Lyft or another similar service.

Next, the panelists offered advice for job applicants who wish to effectively market themselves for industry positions. Woodard affirmed the value of a broad set of practical data-related skills, solid code, a demonstrated interest in data, and a willingness to “get your hands dirty.” She also warned attendees not to overcomplicate problems, as companies emphasize practical and efficient solutions. More personal qualities—including motivation, enthusiasm about the organization, and a demonstrated passion for building a successful product—are equally important.

Ubaru added that IBM seeks highly motivated people with robust programming and data science skills, the ability to deploy artificial intelligence (AI) technology on a large scale, experience with deployment on large data sets, and strong publication records. Woodard clarified that publication records are particularly important for research-heavy roles. If someone has just completed their Ph.D., employers will likely look for physical evidence of their productivity. “Since you’ve been working really hard for five years on one project, I want to see that you’ve made progress on it in a really tangible way,” she said.

When an attendee asked the panelists to describe a sample project and timeline from their current positions, Woodard explained that some tasks focus on incremental improvements while others tackle long-term goals and future technologies. For instance, she recently spent a month rethinking LinkedIn’s knowledge graph and laying out the company’s vision for the next several years.

As with LinkedIn, the nature of IBM’s activities varies widely depending on the objective. Ubaru outlined four different types of projects at IBM. Client projects have clear deadlines and fixed timelines, while exploratory science research can continue for several years without previously defined end goals or objectives. Work towards specific AI products occurs in the form of year- to year-and-a-half-long challenges with multiple stages and quarterly progress reports. Finally, grant-based efforts generally demand more rigid allocation and logging of employee work hours.

Another audience member inquired about data sharing between industry and the public sector — a topic that relates to issues of privacy and sensitivity as well as the data’s worth to the company in question. “There’s a value associated with data,” Woodard said. “So if you release all of the data to the public, you lose that value. But there’s a willingness to share data for something that’s not core proprietary technology, like traffic data at Uber.” Doing so ultimately allows researchers to develop better systems. Yet because many companies gain a competitive advantage from their proprietary data, they are often more willing to share example data sets at smaller volumes.

Ubaru noted that most of IBM’s data comes from its clients and is therefore confidential. Since much of it is sensitive—like banking information, for instance—IBM does not really possess proprietary data. For companies that do, however, Ubaru’s thoughts echoed those of Woodard. “Companies feel a sense of requirement that they need to share data, as long as it doesn’t affect their bottom line,” he said.

As the session wound down, one attendee wondered whether the panelists had observed a lack of certain types of skills or experiences among job applicants. “Academic curricula tend to default to being more theoretical and don’t include some of the aspects around data analysis, machine learning models, and the quality of data that go into the models,” Woodard said. She thus encouraged professors to include techniques like instrumentation, logging, and data collection in their lesson plans so that students learn how to identify and analyze high-quality data.

To support this recommendation, Woodard presented a sample scenario from her employment with Uber. If ridership is low and prices are high, analysts must determine the problem within the system. “Breaking down the problem using data and figuring out the root cause is a practical thing that’s really important,” Woodard said. Though Woodard and her colleagues usually train junior hires on these skills when they arrive, she acknowledged that it would be beneficial if early-career researchers practiced some of those techniques as students.

Yin pointed out that some of the best textbooks in academia contain outdated algorithms. As such, he would like to see a future collaboration between industry and academia to produce a linear programming textbook that suits everyone’s needs. He also mentioned that the U.S. has very few mixed-integer experts, despite the numerous real-world problems that require this particular skillset.

The panel ended with several remarks about the value and utility of mathematics in the industry sector. “Many [MDS22] minisymposia are applicable to things we do at IBM,” Ubaru said. “Math goes without saying; it’s the first thing we need.” Yin supported this sentiment and added that roughly 30 percent of the Alibaba Group’s recent hires hold Ph.D.s in either mathematics or applied mathematics.

Since Woodard uses techniques that span multiple fields, including statistics and economics, she emphasized that researchers must understand all of the probabilistic foundations of their applications, especially when extending a method or creating a reusable platform. A well-rounded education allows one to do so. “That’s the beauty of SIAM,” Haber said as he concluded the session. “It brings together skills from lots of areas, so you don’t have to say that techniques belong only to one field.”

Lina Sorg is the managing editor of SIAM News.  