Extended Intelligences
The Extended Intelligences class is a broad theoretical and practical introduction to AI systems and how they are structured, aimed at helping us, as designers and creators, take advantage of these technologies more consciously and ethically when implementing them in the real world. The workshop ran over two weeks: the first dedicated to building knowledge about AI and its social, cultural, environmental, and economic implications, and the second to the tools, skills, and technologies to be used.
During the first week, Andres Colmenares guided us through the history of AI since its origin and, most importantly, through the concepts and philosophical reflections that shaped the knowledge we have today about the topic, and how our contemporary moment is adopting new tools massively but, most of the time, without ethical awareness. The crucial example from the beginning was "language" and how we use it to communicate and understand the world around us, considering that words carry more than a convenient way of expressing our perception of reality.
Artificial Intelligence is a metaphysical representation of synthesis.
As a consequence, artificial intelligence models carry within words our culture, our way of living, and, most importantly, our preconceptions about ourselves and others, in a context where words already hold a meaning at the specific moment those tools are invented. This process mirrors our society and communities, where some are prioritized over others, a closed representation reflected in the illustration and metaphor of the Mechanical Turk: a man controlling another man (the machine) playing chess, a repetitive cycle of replication and automation.
Beyond those philosophical implications and reflections, it was important to understand that other layers of today's AI models interfere with more than our human societal structures, such as the environmental conditions these systems affect in order to stay running. Behind them are massive industries controlled by big tech companies that need huge infrastructural systems to process enormous amounts of data, consuming vast quantities of energy, water, minerals, electricity, heat, and human labor to keep the machine working and deliver a minimum viable product to the market where it is proposed to be used.
10 million seconds or 10 million dollars?
Looking at the consequences and implications of new AI models around the world, such as Google Gemini, Apple Intelligence, ChatGPT, Stable Diffusion, and others, we could get a clearer vision of the integration and transition from physical to digital actions and reactions that developing these tools entails: even when built more efficiently, they cannot stop the entropy of changes in the physical reality we live in. Finally, with all this information in hand, Andres presented examples from art collectives and design studios that question how these systems are assembled, driving us to understand the critical design factors of using and developing systems that integrate a certain kind of intelligence into the products of our daily lives.
This led us toward a more Speculative Design approach, with schemes, graphics, and projects developed by Superflux and the Near Future Laboratory, clarifying the project we would work on during the second week. After several expository classes, and even a guest from Brazil talking about our relationship with time and how humans are always rushing against it, we were asked to imagine a project we would like to develop for IAAC students twenty years from now, within a Solarpunk scenario. In response, my group proposed an AI system that could help students disassemble and assemble different kinds of tech products.
The second week, held by Pau Artigas, took a more practical approach to AI, explaining the details and structures of how these systems work within the code and algorithms placed and used in different kinds of interfaces and platforms. Some of the lectures built bridges between what Andres had told us and how we could make a difference after reaching a certain level of maturity in implementing AI models in class. However, one of the most interesting aspects of Pau's first lectures was a set of artistic references that used AI as a concept materialized physically, not necessarily as a digital model or interface, extracting the essence of the machine.
Through those examples, we could understand more about how AI is organized and get to know the specific terminologies to use when incorporating these kinds of tools into our world, such as datasets, deep learning, machine learning, models, and AI training, and how each of these technologies is translated into digital actions across the industry, while maintaining the philosophical language approached in the previous week, as a mix of knowledge and speculative practices. Four framings of AI from the lectures stood out:
AI as an expression of current ideology;
AI as an infrastructure;
AI as a character/topic/concept to think with;
AI as a ubiquitous technology.
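To ground the first set of terms before going further, here is a minimal sketch of my own (not from the class materials) showing how a dataset, a model, and training relate in Python, using scikit-learn:

```python
# A dataset of labeled examples, a model, and a training step, in miniature.
from sklearn.datasets import load_digits          # the dataset: labeled images of digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier  # the model: a small neural network

digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=0
)

model = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0)
model.fit(X_train, y_train)                       # "training": fitting the model to the dataset
print(f"accuracy: {model.score(X_test, y_test):.2f}")
```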
Next, Pau asked us to collect datasets, huge quantities of classified, identified, and categorized information, that could fit the aims of our master project direction, using research tools he showed us during class, such as Hugging Face, Kaggle, and Papers with Code. I searched for genetic parts that could compose a plasmid to assemble new models of bacteria. Even if not directly connected with our proposal from the previous week, it was important to experiment and understand which keywords to use when researching.
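As an example of that kind of keyword search, here is a short sketch using the huggingface_hub client; the query term is an illustrative placeholder, not the exact search we ran:

```python
# Illustrative keyword search of the Hugging Face Hub for datasets
# (pip install huggingface_hub); the query term is a placeholder.
from huggingface_hub import HfApi

api = HfApi()
for ds in api.list_datasets(search="plasmid", limit=5):
    print(ds.id)
```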
To continue our group project, we now needed to polish it and understand what could be improved or even changed to make our idea more feasible and adapted to what is actually available on those dataset platforms, along with an explanation, through schemes and infographics, of how these informational channels would work and which connections were needed to achieve a specific output. In addition, before continuing the project and getting our hands dirty, Pau introduced Google Colab and how it could be used to integrate already-made models that translate our inputs into different outputs, such as text and images, using the Python language.
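As a small example of this kind of already-made model integration in Colab, here is a hedged sketch using the transformers library (the model choice and file name are my own placeholders), turning an image input into a text output:

```python
# Minimal image-to-text example for Colab (!pip install transformers pillow);
# the model and the local file name are placeholders.
from transformers import pipeline

captioner = pipeline("image-to-text", model="Salesforce/blip-image-captioning-base")
result = captioner("photo_of_object.jpg")  # hypothetical image path
print(result[0]["generated_text"])
```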
Gianna is an advanced artificial intelligence assistant designed to support IAAC students in identifying and disassembling objects. Inspired by the concept of Jarvis from "Iron Man," it uses a network of advanced sensors, image-recognition cameras, and sound technology embedded across the IAAC building for continuous monitoring and real-time interaction. With these technologies, the assistant not only recognizes objects but also analyzes their structure, providing precise disassembly instructions and pointing out the location of necessary tools within the IAAC.
Gianna is capable of learning and adapting based on data gathered from user interactions, becoming more accurate and efficient over time. The system integrates image recognition, natural language processing, and advanced monitoring technologies, offering comprehensive support for students. With its hyperconnected infrastructure and advanced algorithms, Gianna becomes an indispensable assistant, enabling students to learn more easily and solve problems effectively in IAAC’s dynamic environment.
How does the Gianna prototype work?
Object Identification & AI Analysis: The user takes a photo of an object, which Gianna analyzes to identify it and determine the best way to disassemble it.
Disassembly Instructions Generation: Gianna creates detailed disassembly instructions, enhanced with visual diagrams.
Tool Recommendations: Gianna suggests the appropriate tools needed for disassembling the object.
Tool Location on the IAAC Campus: Gianna informs the user where to find the necessary tools within the IAAC campus.
Workflow of the prototype:
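As a rough illustration of this flow in code, here is a hedged sketch using the Replicate Python client (the platform behind the API link documented later in this post). The model identifiers, inputs, and the tool-location lookup are illustrative placeholders, not the exact prototype code:

```python
# Sketch of the four Gianna steps with the Replicate client (pip install replicate).
# Community models on Replicate usually require a pinned version ("owner/name:hash").
import replicate

photo_url = "https://example.com/broken_keyboard.jpg"  # hypothetical input photo

# 1. Object identification & AI analysis: chat with the image (LLaVA-13B).
analysis = "".join(replicate.run(
    "yorickvp/llava-13b",  # version hash omitted for brevity
    input={"image": photo_url,
           "prompt": "Identify this object and explain how it comes apart."},
))

# 2. Disassembly instructions: turn the analysis into numbered steps (LLaMA-3-8B).
instructions = "".join(replicate.run(
    "meta/meta-llama-3-8b-instruct",
    input={"prompt": f"Write step-by-step disassembly instructions for:\n{analysis}"},
))

# 3. Visual diagrams: text-to-image in IKEA-manual style (illustrative model id).
diagram_urls = replicate.run(
    "ostris/ikea-instructions-lora-sdxl",
    input={"prompt": "exploded-view diagram of a keyboard, ikea instruction style"},
)

# 4. Tool location: a plain lookup table standing in for the campus tool map.
TOOL_LOCATIONS = {"screwdriver": "Fab Lab, ground floor"}  # hypothetical data

print(instructions)
print(diagram_urls)
```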
Bias and ethics issues
Continuous monitoring of students and real-time sound analysis raise concerns about privacy and consent. User data collected by Gianna, such as photos, video, voice inputs, and user locations, must be handled with strict privacy controls to prevent improper use or unauthorized sharing.
In terms of dependence vs. autonomy, there is a potential risk that students may become overly reliant on AI for various projects and tool usage, which could limit learning and result in thoughtless trust in the assistant without fundamental understanding. This may lead to reduced personal exploration, stagnation of creativity, and a loss of independent thinking.
Regarding responsibility, tool creators carry significant responsibility for ensuring the precision and reliability of the AI’s actions. Incorrect advice due to AI errors can lead to unintended consequences, such as equipment damage or harm to a student. Transparency in explaining how Gianna arrives at its conclusions and recommendations is also crucial, enabling users to make informed and responsible decisions when using the system.
Information of at least one dataset related to your task/s
In our research on datasets for this project, we encountered challenges with data that was either too general to meet our specific needs or too precise, making it difficult to apply practically in the context of Gianna at IAAC. After analyzing various options, the best solution for our prototype turned out to be using existing models that come with built-in datasets, providing the right balance of precision and flexibility for further system development.
As part of the documentation, the table at the end of this page lists the datasets from our research, although they were not used in the Gianna prototype.
Information of at least one API/tool/neural network related to your task/s
For the models, we used four, one for each stage of the Gianna system: LLaVA-13B, an API used for chatting with images to receive instructions; Meta-LLaMA-3-8B, a language model designed for generating chat completions; IKEA Instructions Lora SDXL, a text-to-image model that generates visuals in the style of IKEA instruction manuals; and a fourth component that informs the user where to find the necessary tools (see the second table at the end of this page).
Each tool serves a unique purpose within a broader interactive or creative system, combining text, visual, and language-based tasks effectively.
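As a concrete example, the chat-completion model linked in the table at the end of this page can be called in a few lines; this minimal sketch (the prompt is illustrative) streams tokens as they arrive:

```python
# Minimal chat-completion call against the Replicate endpoint documented
# in the table below (pip install replicate; prompt is illustrative).
import replicate

for event in replicate.stream(
    "meta/meta-llama-3-8b-instruct",
    input={"prompt": "List the tools needed to open a laptop case."},
):
    print(str(event), end="")
```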
Simple code demo
https://colab.research.google.com/drive/1bYgwaorBmrFAxEP4z-Q8b5HJ3sF2AxAh?usp=sharing
What have you learned doing it?
- Discovery of datasets and operation of models, understanding of AI basics
- How to integrate models and use Python in Google Colab
- Ethical issues, privacy, balance between dependence on AI and autonomy
As mentioned at the beginning, the Extended Intelligences class was a broad introduction to AI systems, how they are used today, the ethical implications of these kinds of technologies, and how they could change our behavior and way of perceiving the world around us, considering that most of the time they are used to save time and automate actions to increase productivity in a system where not everyone has access. Even before the class, however, I stayed grounded in the view of these new possibilities as only tools, digital platforms that try to replicate the human brain's processes of cognition and synthesis. I would emphasize the perception of AI technologies as a character or topic to think with, not exactly something to use as-is, enhancing the character of the tool as a contribution to your work rather than a substitute for it.
My fear of AI does not lie only in the physical implications of these tools for our environment, something that could be addressed through policies, laws, and guidelines on ethical bias, or perhaps even new kinds of architectural structures developed to make space, collect resources, and feed the huge number of machines. My fear truly lies in how the knowledge placed in these systems could shape the interaction between our language and the way we create new terminologies or shift old ones to new meanings, meanings that can be erased by a common collective intelligence operated to attend, most of the time, to fast and temporary necessities commanded by "common" people. A complicated step for the philosophy of language, or maybe even a new territory for discovering what we still don't know about ourselves.
In the end, my question remains the same: how much intelligence and synthesis capacity do humans have to give to their creations, and how much of that is necessary?
| NAME OF DATASET | LINK |
| --- | --- |
| SCREW-DATASET | |
| METRIC-SCREW-IMAGE-CLASSIFICATION | |
| NOTEBOOK DATASET | |
| MICROCONTROLLER DETECTION | |
| SCREW DATASETS | |
| BACTERIA DATASET | |
| APPLE PRODUCTS DATASET | |
| MODEL | USAGE |
| --- | --- |
| LLaVA-13B | chat with an image for instructions on how to disassemble |
| Meta-LLaMA-3-8B (https://replicate.com/meta/meta-llama-3-8b-instruct/api, from the "Dreams" code) | language model for chat completion |
| IKEA Instructions Lora SDXL | text to image (IKEA instruction style) |
| | informs the user where to find the necessary tools |