Disseminate: The Computer Science Research Podcast

By: Jack Waudby
  • Summary

  • This podcast features interviews with Computer Science researchers. Hosted by Dr. Jack Waudby, each interview highlights the problem(s) the researchers tackled, the solutions they developed, and how their findings can be applied in practice. The podcast is for industry practitioners, researchers, and students. It aims to further narrow the gap between research and practice and to make awesome Computer Science research more accessible. We have 2 types of episode: (i) Cutting Edge (red/blue logo), where we talk to researchers about their latest work, and (ii) High Impact (gold/silver logo), where we talk to researchers about their influential work.


    You can support the show through Buy Me a Coffee. A donation of $3 will help us keep making you awesome Computer Science research podcasts.


    Hosted on Acast. See acast.com/privacy for more information.


Episodes
  • Liana Patel | ACORN: Performant and Predicate-Agnostic Hybrid Search | #60
    2024/11/11

    In this episode, we chat with Liana Patel about ACORN, a groundbreaking method for hybrid search in applications that use mixed-modality data. As more systems require simultaneous access to embedded images, text, video, and structured data, traditional search methods struggle to maintain efficiency and flexibility. Liana explains how ACORN builds on Hierarchical Navigable Small Worlds (HNSW) to enable efficient, predicate-agnostic search through predicate subgraph traversal. This allows ACORN to significantly outperform existing methods, supporting complex query semantics and achieving 2–1,000 times higher throughput on diverse datasets. Tune in to learn more!
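
For readers who want a feel for what traversing a predicate subgraph means, here is a minimal sketch. It is not ACORN itself: it assumes a single-layer proximity graph rather than a full HNSW index, assumes the entry node satisfies the predicate, and uses hypothetical names (`filtered_search`, `passes`, `ef`) and toy data invented for illustration. The search only ever steps onto nodes that pass the predicate, so the filter is applied during traversal rather than before or after it.

```python
import heapq
import math

def filtered_search(vectors, neighbors, passes, query, entry, k, ef):
    """Best-first graph search that only visits nodes passing the predicate.

    Simplified illustration: one graph layer, entry node assumed to pass.
    """
    def dist(node):
        return math.dist(vectors[node], query)

    visited = {entry}
    candidates = [(dist(entry), entry)]   # min-heap: closest unexplored node first
    best = [(-dist(entry), entry)]        # max-heap (negated) of the ef best so far
    while candidates:
        d, node = heapq.heappop(candidates)
        if len(best) >= ef and d > -best[0][0]:
            break                         # nothing left can improve the result set
        for nb in neighbors[node]:
            if nb in visited or not passes(nb):
                continue                  # skip nodes outside the predicate subgraph
            visited.add(nb)
            d_nb = dist(nb)
            if len(best) < ef or d_nb < -best[0][0]:
                heapq.heappush(candidates, (d_nb, nb))
                heapq.heappush(best, (-d_nb, nb))
                if len(best) > ef:
                    heapq.heappop(best)   # drop the current worst result
    ranked = sorted((-neg_d, node) for neg_d, node in best)
    return [node for _, node in ranked[:k]]

# Toy data (hypothetical): five points on a line, with a few extra links.
vectors = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (2.0, 0.0), 3: (3.0, 0.0), 4: (4.0, 0.0)}
neighbors = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 3, 4], 3: [1, 2, 4], 4: [2, 3]}
query = (3.2, 0.0)

def is_even(node):
    return node % 2 == 0                  # stand-in for a structured-data predicate

unfiltered = filtered_search(vectors, neighbors, lambda n: True, query, 0, k=1, ef=2)
filtered = filtered_search(vectors, neighbors, is_even, query, 0, k=2, ef=2)
```

In this toy graph the extra links between nodes two steps apart keep the even-numbered nodes reachable from one another once the odd nodes are filtered out. A sparser graph could leave the predicate subgraph disconnected, which is the kind of problem a hybrid search index has to handle.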


    Links:

    • ACORN: Performant and Predicate-Agnostic Search Over Vector Embeddings and Structured Data [SIGMOD'24]
    • Liana's LinkedIn
    • Liana's X

    53 min
  • High Impact in Databases with... David Maier
    2024/11/04

    In this High Impact episode we talk to David Maier.


    David is the Maseeh Professor Emeritus of Emerging Technologies at Portland State University. Tune in to hear David's story and learn about some of his most impactful work.


    The podcast is proudly sponsored by Pometry, the developers behind Raphtory, the open source temporal graph analytics engine for Python and Rust.


    You can find David on:

    • Homepage
    • Google Scholar

    1 hr 2 min
  • Raunak Shah | R2D2: Reducing Redundancy and Duplication in Data Lakes | #59
    2024/10/28

    In this episode, Raunak Shah joins us to discuss the critical issue of data redundancy in enterprise data lakes, which can lead to soaring storage and maintenance costs. Raunak highlights how large-scale data environments, ranging from terabytes to petabytes, often contain duplicate and redundant datasets that are difficult to manage. He introduces the concept of "dataset containment" and explains its significance in identifying and reducing redundancy at the table level in these massive data lakes—an area where there has been little prior work.


    Raunak then dives into the details of R2D2, a novel three-step hierarchical pipeline designed to efficiently tackle dataset containment. By utilizing schema containment graphs, statistical min-max pruning, and content-level pruning, R2D2 progressively reduces the search space to pinpoint redundant data. Raunak also discusses how the system, implemented on platforms like Azure Databricks and AWS, offers significant improvements over existing methods, processing TB-scale data lakes in just a few hours with high accuracy. He concludes with a discussion on how R2D2 optimally balances storage savings and performance by identifying datasets that can be deleted and reconstructed on demand, providing valuable insights for enterprises aiming to streamline their data management strategies.
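
The idea of narrowing the search space in three stages can be sketched roughly as follows. This is a brute-force illustration under stated assumptions, not the R2D2 implementation: tables are small in-memory lists of row dictionaries, every table is non-empty, the schema step compares column-name sets directly instead of building a containment graph, and the final step is an exact row check standing in for content-level pruning. All function and table names are hypothetical.

```python
from itertools import permutations

def columns(rows):
    """Column names of a table, read from its first row (tables assumed non-empty)."""
    return set(rows[0])

def min_max_ok(child, parent):
    """Cheap statistical filter: each child column's range must fit inside the parent's."""
    for col in columns(child):
        c_vals = [row[col] for row in child]
        p_vals = [row[col] for row in parent]
        if min(c_vals) < min(p_vals) or max(c_vals) > max(p_vals):
            return False
    return True

def content_contained(child, parent):
    """Expensive exact check: every child row appears in the parent's projection."""
    cols = sorted(columns(child))
    parent_rows = {tuple(row[c] for c in cols) for row in parent}
    return all(tuple(row[c] for c in cols) in parent_rows for row in child)

def find_contained(tables):
    """Return (child, parent) pairs where child is fully contained in parent."""
    found = []
    for child_name, parent_name in permutations(tables, 2):
        child, parent = tables[child_name], tables[parent_name]
        if not columns(child) <= columns(parent):
            continue                      # step 1: schema containment
        if not min_max_ok(child, parent):
            continue                      # step 2: min-max pruning
        if content_contained(child, parent):
            found.append((child_name, parent_name))  # step 3: content check
    return found

# Hypothetical toy data lake.
tables = {
    "orders": [{"id": 1, "amount": 10}, {"id": 2, "amount": 25}, {"id": 3, "amount": 40}],
    "orders_small": [{"id": 1, "amount": 10}, {"id": 2, "amount": 25}],
    "orders_ids": [{"id": 2}, {"id": 3}],
    "refunds": [{"id": 9, "amount": 5}],
}
redundant = find_contained(tables)
```

The ordering is the point of the design: the cheap schema and min-max checks discard most candidate pairs, so the costly row-level comparison runs on only a few.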


    Materials:

    • SIGMOD'24 Paper - R2D2: Reducing Redundancy and Duplication in Data Lakes
    • ICDE'24 - Towards Optimizing Storage Costs in the Cloud



    31 min
