
A One-in-a-Million Chance of a Crash: If 999,999 Flights Were Safe and You Are on the Millionth, Would You Still Go?

A friend once blurted out an idle thought that is actually a perfectly legitimate question: suppose the chance of a crash for some airline is 1 in 1,000,000. Suppose we also know that the previous 999,999 flights all landed safely, and we are about to board flight number 1,000,000. Would we dare get on that plane? Is a crash certain to happen on that flight?

This is a very interesting question! If that flight turns out not to crash, doesn’t that mean the chance of a crash is no longer one in a million (since out of 1 million flights, not a single one crashed)? And if it really is true that 1 in every million flights “always” crashes, does that mean our millionth flight is certain to crash?

To clear up this confusion, we need to revisit what the probability of an event and its distribution actually mean. First of all, we should understand that the statistics the media report about an event (say, plane crashes) are usually estimates based on historical data. That is, the reported proportion is not the true value; rather, at a given confidence level, we can be reasonably sure that the estimate is “close” to the real-world proportion. Moreover, the estimate depends heavily on the historical data available beforehand. This will become clearer after the discussion below.

To understand this phenomenon, we will use a model that is much more general and easier to analyze. Suppose there are n apples in a basket (we will pick n large enough later, at least a million, say) and exactly 1 of them is rotten. We can quickly conclude that when we pick one apple at random, the probability of getting the rotten one is one in n. Note that the probability in this case is an exact value, not an estimate: the probability of getting the rotten apple really is 1/n. Now, if we draw an apple at random n-1 times (putting it back after each draw) and get a good apple every time, what is the probability that the n-th apple drawn is rotten? Observe that if every draw is followed by replacement, then the apple chosen in one trial does not depend on the apple chosen in any other trial. In the language of statistics, the draws are mutually independent. Therefore, being told that the first n-1 draws produced no rotten apple has no effect whatsoever on the probability for the n-th draw. In other words, the probability that the n-th apple is rotten equals the probability that the first apple is rotten, which is simply 1/n.

Is that strange? If we set n=6, this is the same as rolling a die. The probability of rolling a 1 is 1/6. Now, if we roll the die 5 times and never get a 1, what is the probability that the sixth roll shows a 1? It is 1/6, of course, just like any other roll. So no, there is nothing strange about any of this.
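The independence argument can be checked by brute force for the die case n=6. The sketch below is only an illustration (the function name `conditional_last_hit` is made up here): it enumerates every equally likely sequence of n draws with replacement and computes the exact conditional probability that the last draw is the bad item, given that all earlier draws were good.

```python
from fractions import Fraction
from itertools import product

def conditional_last_hit(n):
    """P(draw n is the bad item | draws 1..n-1 were all good), obtained by
    enumerating all n**n equally likely sequences of draws with replacement."""
    bad = 0  # label of the single rotten apple (or the face "1" on a die)
    given = 0  # sequences whose first n-1 draws are all good
    hits = 0   # among those, sequences whose last draw is the bad item
    for seq in product(range(n), repeat=n):
        if bad not in seq[:-1]:
            given += 1
            if seq[-1] == bad:
                hits += 1
    return Fraction(hits, given)

result = conditional_last_hit(6)
print(result)  # 1/6
```

Enumeration is only practical for small n (the number of sequences is n to the power n), but the answer 1/n does not depend on n being small.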

Still, we can ask the question again in a different way. Doesn’t a one-in-n probability of drawing the rotten apple mean that in n draws, one of them will be rotten? Does it guarantee that n draws must contain 1 rotten apple? The answer is NO, and we will prove it mathematically. In a single draw, the probability of getting a healthy apple is $\left(1-\dfrac{1}{n}\right)$. Hence the probability of finding no rotten apple in n draws is $\left(1-\dfrac{1}{n}\right)^n$, which converges to $1/e\approx 0.3679$ as $n\rightarrow \infty$. Note that this is a different quantity from the probability analyzed in the previous paragraph (here, nothing is conditioned on). So our calculation says it is NOT CERTAIN that a rotten apple turns up in n draws: for large n, no rotten apple at all is found with probability about 36.79%, which means at least one is found with probability only about 63.21%, clearly less than 100%.
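To see how quickly this probability approaches 1/e, here is a small numerical sketch (the helper name `prob_no_rotten` is chosen just for this example). It evaluates the formula for the die case, for n=100, and for n equal to one million.

```python
import math

def prob_no_rotten(n):
    """Probability that n independent draws with replacement never hit
    the single rotten apple among n apples."""
    return (1 - 1 / n) ** n

for n in (6, 100, 10**6):
    p_none = prob_no_rotten(n)
    # p_none: no rotten apple seen; 1 - p_none: at least one rotten apple seen
    print(n, p_none, 1 - p_none)

print("1/e =", math.exp(-1))
```

Even for n=6 the value is already about 0.335, so the limit 1/e is a good approximation over a wide range of n.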

Now, let us return to our airplane story. In this case, the one-in-a-million figure is not an exact probability, because we cannot prove mathematically that the a priori proportion of plane crashes is exactly 1 in a million. So how can anyone claim this probability? Most likely the figure comes from an estimator computed from past data. Computing such an estimator assumes that the distribution of crashes today is the same as in the past, and that crashes on different flights are independent of one another. And because the calculation is an estimate, it always carries an error. Interestingly, what happens now can update the estimator and make it more accurate, even though in practice that same estimator is used to predict the outcome of those very events. In other words, when someone is able to conclude that the probability of a crash is 1 in a million, a large number of flights (at least a million) must already have been sampled. Conversely, if all we know is the data from 999,999 flights, none of which crashed, we cannot yet conclude that the estimated crash probability is 1 in a million. The story at the heart of our problem is therefore not very realistic and rests on a poorly founded premise.
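As a rough illustration of why 999,999 safe flights alone do not pin down the figure, the sketch below computes the naive point estimate and an exact one-sided upper confidence bound for the zero-event case. This is an addition for illustration, not part of the original argument: it assumes independent flights with a common crash probability p, and the function name `zero_event_upper_bound` is made up for this example. The bound is the value of p solving (1 - p)^n = 1 - confidence.

```python
import math

def zero_event_upper_bound(n, confidence=0.95):
    """Largest p still compatible (at the given confidence level) with seeing
    0 crashes in n independent flights: solves (1 - p)**n = 1 - confidence."""
    alpha = 1 - confidence
    # p = 1 - alpha**(1/n), written with expm1 for numerical accuracy
    return -math.expm1(math.log(alpha) / n)

n = 999_999                  # flights observed, all safe
point_estimate = 0 / n       # naive estimate: crashes / flights
upper = zero_event_upper_bound(n)

print(point_estimate)        # 0.0
print(upper)                 # roughly 3 in a million
```

Under these assumptions the data are compatible with any crash probability between 0 and roughly 3 in a million, so a claim of exactly 1 in a million must rest on more than these 999,999 flights.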

Moreover, even if we could obtain the exact probability (as in the rotten apple story), it would still NOT BE CERTAIN that there is always one crash in every million flights. So our assurance that the millionth flight is not bound to crash comes from two things: the error term of our estimator and our theoretical probability calculation. Therefore, when you find out that you are the millionth passenger, don’t panic and throw away your ticket. You just need to look at the statistic from another angle and realize that it is merely a trick. 🙂

MCF – MMC ITB 2014

Mathematical Challenge Festival (MCF) Institut Teknologi Bandung is a biennial event organized by the Mathematics Student Association (Himpunan Mahasiswa Matematika, HIMATIKA) ITB since 2002. So far, MCF ITB has been held five times.

MCF ITB is an event where high school students from all over Indonesia come together to celebrate mathematics as a fundamental science that colors their lives. At this event, every MCF participant gets the chance to enjoy mathematics with others in a coherent series of activities.

 

Mathematics Modeling Competition (MMC) Institut Teknologi Bandung is the inaugural program launched by FMIPA ITB through its Industrial and Financial Mathematics Research Group (Kelompok Keahlian Matematika Industri dan Keuangan, KK MIK). It is the first mathematical modeling contest in Indonesia open to university students. Held alongside MCF ITB, the event is meant to let university participants also celebrate mathematics together through a mathematical modeling competition.

 

The event will feature a Mathematical Modeling Competition for both high school and university students. At the university level, this is the FIRST mathematical modeling competition in Indonesia!

Sign up now! http://www.math.itb.ac.id/mcf-mmc Registration is limited!

#IndonesiaBermatematika

ICM 2013 – Network Model of Earth’s Health

This was my first time joining a mathematical modeling contest. As far as I knew, a student mathematics contest was something held over a few hours, with participants working individually in a closed, silent room on some hard problems. This time, however, was totally different. MCM/ICM 2013 := Mathematics/Interdisciplinary Contest in Modeling, 1-4 February 2013. This was my first 4 x 24 hrs nonstop mathematics competition. For those 96 hrs, we were ‘locked up’ in a mathematics laboratory, just doing research on a problem that initially didn’t even seem like a mathematics problem.

Well, it actually started on 2 Feb (officially it was the 1st, but in WIB the start time fell at 8 am on 2 Feb). My 2 friends, Kevin and Nita, and I faced 3 problems, of which we had to choose 1 to model. We chose problem C, which is also the ICM problem: “Network Model of Earth’s Health”. It is a global environmental and biological issue. The problem said that our mother earth is becoming more and more unpredictable. Many past biological forecasts failed to predict today’s phenomena. Of course, we all know that we can no longer say our earth is in good condition: natural disasters, various global diseases, extreme poverty & food scarcity, higher carbon dioxide emissions, etc. Although most of these phenomena might have been forecast many years ago, their impact is far beyond what we expected. We need to develop a more complex model to get better forecasts. The important aspect that nobody cared about at the time is the presence of local factors. Earlier models considered only the global factors that can affect the earth’s health. Today, we can’t consider those alone. We need to model the whole big picture. Every local factor counts! Every detail of the local properties of the earth’s health directly influences the global-scale state shift! So, why is this so important? Why don’t we just go with the flow and enjoy our mother nature? Holy crap! That’s like being about to cross a very old bridge, with no way to tell whether you can get across safely, and just saying, ‘Why should we worry about this? Just go with the flow. Is it a problem if the bridge goes down?’. To respond properly to possible future natural phenomena, we should reduce the number of biological surprises. That’s the point of global biological and ecological forecasting!

Next, the ICM problem setters gave us many, many, many papers to read. 4 days to do research! On the first day, we just read, read, read, read, and read. Biology papers, environment papers, ecology papers, economics papers, mathematics papers, etc. The first day was the not-so-exciting part of the research. The ICM asked us to build a dynamic network model for this problem. Dynamic, so ‘time’ should be one of the most important dimensions of our model. Network, so we should construct some nodes and link them with an appropriate type of relation. On day one, we found at least 24 variables involved in measuring the global health of the earth. TWENTY FOUR VARIABLES. Of course, those variables are absolutely not independent. With 24 variables, there can be up to $\binom{24}{2} = 276$ pairwise relations between them. That’s quite insane, right? So, in our first attempt, we chose just about 10 variables for our model, among them: food availability, water availability, wood availability, disaster regulation, cultural sites, GDP, and carbon emissions into the atmosphere. Even though we had built very complicated relations between them, we found that our model was not a network at all. All of the countries (which act as the nodes) were independent of every other country. The health of a nation was not determined by any other nation’s condition. That was a big failure. Our first failure. Day 1 ended with no promising model at all.
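The pair count is easy to verify. A quick sketch using Python's standard `math.comb` (the variable names are just for this example) gives the number of unordered pairs among the 24 variables, and among the roughly 10 variables of the first attempt:

```python
from math import comb

n_variables = 24
n_pairs = comb(n_variables, 2)   # unordered pairs: 24 * 23 / 2
n_pairs_reduced = comb(10, 2)    # pairs among about 10 variables

print(n_pairs)          # 276
print(n_pairs_reduced)  # 45
```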

Day two. We started the day with a new perspective on our model. We also did more data mining in the morning. By the middle of the day, we had reached a conclusion: you can’t model those 24 variables in just 4 days. You might if you had 4 months or 4 years, but for us that was not possible. So, instead of staying stuck with our 10-variable model, we chose another model. A 1-variable model. This is embarrassing, but it was the only one we could handle. We chose this 1 variable: the food supply of each country. Of course, this variable is a local factor of each nation. The food supply in country A cannot be the same as in country B. Other factors that may be involved in our model are the political and economic stability of each country, and also the strength of the bilateral political relation between any 2 nations. OK, in a food flow, each country should have a production factor, a consumption factor, an export factor, and an import factor. These factors varied among countries, and moreover the rate of each also varied extremely. This was our second attempt. We were so excited about this model, at least at that time T__T. Next, we started to gather each country’s data on food production, consumption, and international trade. This is really not an easy job. Gathering the data of one district may be easy, but gathering the data of 20 countries, along with the existing bilateral relationships between them, is really, really exhausting. By the end of the day, we still didn’t have the complete data for each nation. However, the embryo of our model already existed by then. A dynamic directed graph. We chose 24 countries around the world as the vertices, representing our whole mother nature, and we said that country A has a flow to country B if and only if there is a food export deal from country A to country B. That was our Day 2.
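The graph definition above can be sketched in a few lines. This is only an illustration of the idea, not our contest code: the country labels and export deals are hypothetical placeholders, and `build_digraph` and `has_flow` are names made up for this example. A dynamic version would keep one such snapshot per time step.

```python
# Hypothetical export deals as (exporter, importer) pairs; not real trade data.
export_deals = [("A", "B"), ("A", "C"), ("C", "B")]

def build_digraph(deals):
    """Return adjacency sets: graph[a] contains b iff a has a food export deal to b."""
    graph = {}
    for exporter, importer in deals:
        graph.setdefault(exporter, set()).add(importer)
        graph.setdefault(importer, set())  # every country appears as a vertex
    return graph

def has_flow(graph, a, b):
    """True iff there is a directed edge (a flow) from country a to country b."""
    return b in graph.get(a, set())

graph = build_digraph(export_deals)
print(has_flow(graph, "A", "B"))  # True
print(has_flow(graph, "B", "A"))  # False: the relation is directed
```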

Day 3. It’s Sunday. While our other friends were enjoying a relaxing day, we weren’t. Our morning research began with collecting data (again). Finding out which country exports how much food to which other country is really, really hard. We only found complete statistics for the United States. We did not find any data about other countries’ food trade. That sucks. Next, we decided to collect only the data on wheat and rice production, consumption, export, and import. Just that. So it would be another simplification of our model. Okay, let’s forget about the data. That’s the not-so-interesting part of our research.

Now, we’re gonna talk about the model. As I’ve mentioned above, our model was a dynamic digraph. We built a weighted adjacency matrix for our graph, say matrix K. Each entry $k_{ij}$ means that country i exports an amount $k_{ij}$ of wheat and rice to country j. Next, as I’ve also mentioned, we have a production vector and a consumption vector, with one entry per country. All of these structures are then influenced by the economic and political condition of each country and of every pair of countries (bilateral). They are also influenced by each country’s scientific research on food production. Each of those conditions is modeled as a stochastic process. We assigned random variables to each of those factors and chose an appropriate distribution for each of them. Determining the proper distribution was also really hard work. We had to deal with transformations of distributions to account for changes in mode and mean, and this stuff is a little bit complicated.

Later, we modeled the consumption of each country as following its population. The higher the population, the higher the food demand. We then modeled population growth with the simple differential equation (the population growth / radioactive decay model) introduced in calculus. The very simple one, the naive one, and the one that later destroyed our whole model. In this model, the equation gives unlimited growth of consumption, driven by an exponentially growing human population. This allows our model to produce consumption higher than the production rate. This was our second failure. In our simulation, we projected that most developed countries such as the United States, Russia, Japan, China, etc. would go down in 2020 for lack of food supply. We thought that our model was absolutely not feasible for serious biological forecasting.
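The failure mode is easy to reproduce in a toy version. The sketch below is not our submitted model: all numbers are hypothetical, the stochastic factors are left out, supply is held constant, and the function names are made up for this example. It uses an export matrix K (entry K[i][j] is the amount exported from country i to country j), computes each country's net supply, and finds the first year in which exponentially growing consumption exceeds that supply.

```python
import math

def net_supply(K, production, i):
    """Production of country i plus imports (column i of K) minus exports (row i of K)."""
    imports = sum(K[j][i] for j in range(len(K)))
    exports = sum(K[i])
    return production[i] + imports - exports

def population(p0, rate, t):
    """Solution of dP/dt = rate * P with P(0) = p0: unbounded exponential growth."""
    return p0 * math.exp(rate * t)

def first_shortage_year(start_year, p0, rate, per_capita, supply, horizon=100):
    """First year in which consumption exceeds a constant supply, or None if never."""
    for t in range(horizon + 1):
        if per_capita * population(p0, rate, t) > supply:
            return start_year + t
    return None

# Hypothetical two-country example: country 0 exports 10 units to country 1.
K = [[0, 10],
     [0, 0]]
production = [130, 110]

supply_0 = net_supply(K, production, 0)  # 130 + 0 - 10 = 120
year = first_shortage_year(2013, p0=100.0, rate=0.02, per_capita=1.0, supply=supply_0)
print(supply_0, year)  # 120 2023
```

With a constant supply and any positive growth rate, a shortage year always exists if the horizon is long enough, which is exactly why the unbounded growth assumption breaks the forecast.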

Day 4. It’s Monday. Many of our friends were in lectures & we were just staying put in the laboratory. Miserable. Okay, at this stage, we were just writing our research report. Making a document, a paper-like document, to be submitted to the contest committee. Nothing special here. Just several hours spent in front of a laptop, typing and typing. By the end of the day, we had not fixed the failure in our model. We submitted the exponential consumption model. That’s all.

Not a great model. Not an amazing result. No sophisticated solution / conclusion. However, we were happy and enjoyed those days. Personally, I think these events may affect my whole life ahead. Thanks, Lord, for the providence You’ve given us. Thanks for the opportunity. Thanks for the days.
